Let $\mathcal{F}$ be a separable uniformly bounded family of measurable functions
on a standard measurable space $(X, \mathcal{X})$, and let
$N_{[]}(\mathcal{F}, \varepsilon, \mu)$ be the smallest number of
$\varepsilon$-brackets in $L^1(\mu)$ needed to cover $\mathcal{F}$. The following
are equivalent:

1. $\mathcal{F}$ is a universal Glivenko-Cantelli class.

2. $N_{[]}(\mathcal{F}, \varepsilon, \mu) < \infty$ for every $\varepsilon > 0$ and every probability measure $\mu$.

3. $\mathcal{F}$ is totally bounded in $L^1(\mu)$ for every probability measure $\mu$.

4. $\mathcal{F}$ does not contain a Boolean $\sigma$-independent sequence.

It follows that universal Glivenko-Cantelli classes are uniformity classes for
general sequences of almost surely convergent random measures.

1. Main results. Let $(X, \mathcal{X})$ be a measurable space, and let
$\mathcal{F}$ be a family of measurable functions on $(X, \mathcal{X})$. Given a
probability measure $\mu$ on $(X, \mathcal{X})$, the family $\mathcal{F}$ is said
to be a $\mu$-Glivenko-Cantelli class (cf. [31] or [13, section 6.6]) if

$$\sup_{f \in \mathcal{F}} \left| \frac{1}{n} \sum_{k=1}^{n} f(X_k) - \mu(f) \right| \xrightarrow{n \to \infty} 0 \quad \text{a.s.},$$

where $(X_k)_{k \ge 1}$ is the i.i.d. sequence of $X$-valued random variables with
distribution $\mu$, defined on its canonical product probability space.^1 The
class $\mathcal{F}$ is said to be a universal Glivenko-Cantelli class if it is
$\mu$-Glivenko-Cantelli for every probability measure $\mu$ on $(X, \mathcal{X})$.
The goal of this paper is to characterize the universal Glivenko-Cantelli
property in the case that $\mathcal{F}$ is separable and $(X, \mathcal{X})$ is a
standard measurable space (these regularity assumptions will be detailed below).
Somewhat surprisingly, we find that universal Glivenko-Cantelli classes are in
fact uniformity classes for convergence of (random) probability measures under
the assumptions of this paper, so that their applicability extends substantially
beyond the setting of laws of large numbers for i.i.d. sequences that is inherent
in their definition.



"This work was partially supported by NSF grant DMS- 1005575. 
AMS 2000 subject classifications: 60F15, 60B10, 41A46

Keywords and phrases: universal Glivenko-Cantelli classes, uniformity classes,
uniform convergence of random measures, entropy with bracketing, Boolean
independence

^1 The supremum in the definition of the $\mu$-Glivenko-Cantelli property need
not be measurable in general when the class is uncountable. However,
measurability will turn out to hold in the setting of our main results as a
consequence of the proofs. See section 3.5 below for further discussion.

The following probability-free independence properties for families of functions
will play a fundamental role in this paper. These notions date back to Marczewski
[23] (for sets) and Rosenthal [27] (for functions, see also [8]).

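Definition 1.1. A family $\mathcal{F}$ of functions on a set $X$ is said to be
Boolean independent at levels $(\alpha, \beta)$ if for every finite subfamily
$\{f_1, \ldots, f_n\} \subseteq \mathcal{F}$

$$\bigcap_{j \in F} \{f_j < \alpha\} \cap \bigcap_{j \notin F} \{f_j > \beta\} \neq \varnothing \quad \text{for every } F \subseteq \{1, \ldots, n\}.$$

A sequence $(f_i)_{i \in \mathbb{N}}$ is said to be Boolean $\sigma$-independent
at levels $(\alpha, \beta)$ if

$$\bigcap_{j \in F} \{f_j < \alpha\} \cap \bigcap_{j \notin F} \{f_j > \beta\} \neq \varnothing \quad \text{for every } F \subseteq \mathbb{N}.$$

A family (sequence) of functions is called Boolean ($\sigma$-)independent if it
is Boolean ($\sigma$-)independent at levels $(\alpha, \beta)$ for some
$\alpha < \beta$.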
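
For a finite family on a finite set, the condition of Definition 1.1 can be checked by brute force. The following is a minimal sketch, not part of the original argument: the helper name `is_boolean_independent` and the two toy families are illustrative assumptions. Since a witness for every pattern of the full family restricts to a witness for every pattern of any subfamily, it is enough to test the full family.

```python
from itertools import product

def is_boolean_independent(functions, points, alpha, beta):
    """Check Definition 1.1 for a finite family on a finite set of points.

    Returns True if, for every index subset F, some point x satisfies
    f_j(x) < alpha for j in F and f_j(x) > beta for j not in F.
    """
    n = len(functions)
    for pattern in product([True, False], repeat=n):
        # pattern[j] is True when index j belongs to the subset F
        witness_exists = any(
            all((f(x) < alpha) if in_F else (f(x) > beta)
                for f, in_F in zip(functions, pattern))
            for x in points
        )
        if not witness_exists:
            return False
    return True

# Hypothetical example 1: coordinate functions on the cube {0,1}^3.
cube = list(product([0, 1], repeat=3))
coords = [lambda x, j=j: x[j] for j in range(3)]

# Hypothetical example 2: indicators of two nested half-lines on a grid.
grid = [i / 10 for i in range(11)]
half_lines = [lambda x, t=t: 1 if x <= t else 0 for t in (0.3, 0.6)]

print(is_boolean_independent(coords, cube, 0.25, 0.75))      # True
print(is_boolean_independent(half_lines, grid, 0.25, 0.75))  # False
```

The nested half-lines fail because no point can lie outside the larger set while lying inside the smaller one.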

We also recall the well-known notions of bracketing and covering numbers. 

Definition 1.2. Let $\mathcal{F}$ be a class of functions on a measurable space
$(X, \mathcal{X})$. Given $\varepsilon > 0$ and a probability measure $\mu$ on
$(X, \mathcal{X})$, a pair of measurable functions $f^+, f^-$ such that
$f^- \le f^+$ pointwise and $\mu(f^+ - f^-) \le \varepsilon$ defines an
$\varepsilon$-bracket in $L^1(\mu)$:
$[f^-, f^+] := \{f : f^- \le f \le f^+ \text{ pointwise}\}$. Denote by
$N_{[]}(\mathcal{F}, \varepsilon, \mu)$ the cardinality of the smallest
collection of $\varepsilon$-brackets in $L^1(\mu)$ covering $\mathcal{F}$, and by
$N(\mathcal{F}, \varepsilon, \mu)$ the cardinality of the smallest covering of
$\mathcal{F}$ by $\varepsilon$-balls in $L^1(\mu)$.

A measurable space $(X, \mathcal{X})$ is said to be standard if it is
Borel-isomorphic to a Polish space. A class of functions $\mathcal{F}$ on a set
$X$ will be said to be separable if it contains a countable dense subset for the
topology of pointwise convergence in $\mathbb{R}^X$.^2 We can now formulate our
main result.

THEOREM 1.3. Let $\mathcal{F}$ be a separable uniformly bounded family of
measurable functions on a standard measurable space $(X, \mathcal{X})$. The
following are equivalent:

1. $\mathcal{F}$ is a universal Glivenko-Cantelli class.

2. $N_{[]}(\mathcal{F}, \varepsilon, \mu) < \infty$ for every $\varepsilon > 0$ and every probability measure $\mu$.

3. $N(\mathcal{F}, \varepsilon, \mu) < \infty$ for every $\varepsilon > 0$ and every probability measure $\mu$.

4. $\mathcal{F}$ contains no Boolean $\sigma$-independent sequence.

^2 This notion of separability is not commonly considered in empirical process
theory. A sequential counterpart is more familiar: $\mathcal{F}$ is called
pointwise measurable if it contains a countable subset $\mathcal{F}_0$ such that
every $f \in \mathcal{F}$ is the pointwise limit of a sequence in $\mathcal{F}_0$
(cf. [33, Example 2.3.4]). In general, separability is much weaker than pointwise
measurability. However, a deep result of Bourgain, Fremlin and Talagrand
[8, Theorem 4D(viii)$\Rightarrow$(vi)] implies that a separable uniformly bounded
family of measurable functions on a standard space is necessarily pointwise
measurable if it contains no Boolean $\sigma$-independent sequence. Thus
universal Glivenko-Cantelli classes satisfying the assumptions of Theorem 1.3 are
always pointwise measurable, though this is far from obvious a priori. This fact
will not be needed in our proofs.

A notable aspect of this result is that the four equivalent conditions of Theorem 
1.3 are quite different in nature: roughly speaking, the first condition is probabilis- 
tic, the second and third are geometric and the fourth is combinatorial. 

The implication $1 \Rightarrow 2$ in Theorem 1.3 is the most important result of this paper.
A consequence of this implication is that universal Glivenko-Cantelli classes can 
be characterized as uniformity classes in a much more general setting. 

COROLLARY 1.4. Under the assumptions of Theorem 1.3, the following are
equivalent to the equivalent conditions 1-4 of Theorem 1.3:

5. For any probability measure $\mu$ on $(X, \mathcal{X})$ and net of probability
measures $(\mu_\tau)_{\tau \in I}$ such that $\mu_\tau \to \mu$ setwise, we have
$\sup_{f \in \mathcal{F}} |\mu_\tau(f) - \mu(f)| \to 0$.

6. For any probability measure $\mu$ on $(X, \mathcal{X})$ and sequence of random
probability measures (kernels) $(\mu_n)_{n \in \mathbb{N}}$ such that
$\mu_n(A) \to \mu(A)$ a.s. for every $A \in \mathcal{X}$, we have
$\sup_{f \in \mathcal{F}} |\mu_n(f) - \mu(f)| \to 0$ a.s.

7. For any countably generated reverse filtration
$(\mathcal{G}_{-n})_{n \in \mathbb{N}}$ and $X$-valued random variable $Z$,
$\sup_{f \in \mathcal{F}} |\mathbf{P}_{\mathcal{G}_{-n}}(f(Z)) - \mathbf{P}_{\mathcal{G}_{-\infty}}(f(Z))| \to 0$ a.s.

8. For any strictly stationary sequence $(Z_n)_{n \in \mathbb{N}}$ of $X$-valued
random variables,
$\sup_{f \in \mathcal{F}} \left| \frac{1}{n} \sum_{k=1}^{n} f(Z_k) - \mathbf{P}_{\mathcal{I}}(f(Z_0)) \right| \to 0$
a.s. ($\mathcal{I}$ is the invariant $\sigma$-field).

Here $\mathbf{P}_{\mathcal{G}}$ denotes any version of the regular conditional
probability $\mathbf{P}[\,\cdot\,|\mathcal{G}]$.

The characterization provided by Theorem 1.3 and Corollary 1.4 is proved under
three regularity assumptions: that $\mathcal{F}$ is uniformly bounded and
separable, and that $(X, \mathcal{X})$ is standard. It is not difficult to show
that any universal Glivenko-Cantelli class is uniformly bounded up to additive
constants (see, for example, [15, Proposition 4]), so that the assumption that
$\mathcal{F}$ is uniformly bounded is not a restriction. We will presently argue,
however, that without the remaining two assumptions a characterization along the
lines of this paper cannot be expected to hold in general.

In the case that $\mathcal{F}$ is not separable, there are easy counterexamples
to Theorem 1.3. For example, consider the class $\mathcal{F}$ consisting of all
indicator functions of finite subsets of $X$. It is clear that this class is not
$\mu$-Glivenko-Cantelli for any nonatomic measure $\mu$, yet condition 3 of
Theorem 1.3 holds. Conversely, [2, section 1.2] gives a simple example of a
universal Glivenko-Cantelli class (in fact, a Vapnik-Chervonenkis class that is
image admissible Suslin, cf. [13, Corollary 6.1.10]) for which condition 8 of
Corollary 1.4, and therefore condition 2 of Theorem 1.3, are violated.

In the case that $(X, \mathcal{X})$ is not standard, an easy counterexample to
Theorem 1.3 is obtained by choosing $X = [0, 1]$ and $\mathcal{X} = 2^X$. Assuming
the continuum hypothesis, nonatomic probability measures on $(X, \mathcal{X})$ do
not exist [14, Theorem C.1], so that any uniformly bounded family of functions is
trivially universal Glivenko-Cantelli. But we can clearly choose a uniformly
bounded Boolean $\sigma$-independent sequence $\mathcal{F}$ of functions on $X$,
in contradiction to Theorem 1.3. This example is arguably pathological, but
various examples given by Dudley, Giné and Zinn [15] show that such phenomena can
appear even in Polish spaces if we admit universally measurable functions.
Therefore, in the absence of some regularity assumption on $(X, \mathcal{X})$,
the universal Glivenko-Cantelli property can be surprisingly broad. In
Appendix C, we show that it is consistent with the usual axioms of set theory
that the implications in Theorem 1.3 whose proof relies on the assumption that
$(X, \mathcal{X})$ is standard may fail in a general measurable space. I do not
know whether it is possible to obtain examples of this type that do not depend on
additional set-theoretic axioms.

For the case where (X, X) is a general measurable space we will prove the fol- 
lowing quantitative result, which is of independent interest. 

Definition 1.5. Let $\gamma > 0$. A family $\mathcal{F}$ of functions on a set
$X$ is said to $\gamma$-shatter a subset $X_0 \subseteq X$ if there exist levels
$\alpha < \beta$ with $\beta - \alpha \ge \gamma$ such that, for every finite
subset $\{x_1, \ldots, x_n\} \subseteq X_0$, the following holds:

$$\forall F \subseteq \{1, \ldots, n\}, \ \exists f \in \mathcal{F} \text{ so that } f(x_j) < \alpha \text{ for } j \in F, \ f(x_j) > \beta \text{ for } j \notin F.$$

The $\gamma$-dimension of $\mathcal{F}$ is the maximal cardinality of
$\gamma$-shattered finite subsets of $X$.
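
For finite classes on finite sets, shattering can be explored by exhaustive search. The sketch below is illustrative only: the function names and the two toy classes are assumptions. It also fixes the levels $(\alpha, \beta)$ in advance rather than searching over all admissible levels. For indicator functions this loses nothing, since any levels $0 < \alpha < \beta < 1$ separate the values 0 and 1.

```python
from itertools import combinations, product

def shatters(functions, subset, alpha, beta):
    """True if every low/high pattern on `subset` is realized at levels (alpha, beta)."""
    for pattern in product([True, False], repeat=len(subset)):
        # pattern[j] is True when f must satisfy f(x_j) < alpha, else f(x_j) > beta
        realized = any(
            all((f(x) < alpha) if low else (f(x) > beta)
                for x, low in zip(subset, pattern))
            for f in functions
        )
        if not realized:
            return False
    return True

def max_shattered_size(functions, points, alpha, beta):
    """Largest cardinality of a subset of `points` shattered at the fixed levels."""
    best = 0
    for size in range(1, len(points) + 1):
        if any(shatters(functions, s, alpha, beta)
               for s in combinations(points, size)):
            best = size
        else:
            break  # subsets of a shattered set are shattered, so larger sizes fail too
    return best

points = [0, 1, 2, 3]
# Hypothetical class 1: indicators of half-lines {x <= t}.
half_lines = [lambda x, t=t: 1 if x <= t else 0 for t in (-1, 0, 1, 2, 3)]
# Hypothetical class 2: indicators of intervals {a <= x <= b}.
intervals = [lambda x, a=a, b=b: 1 if a <= x <= b else 0
             for a in points for b in points if a <= b]

print(max_shattered_size(half_lines, points, 0.25, 0.75))  # 1
print(max_shattered_size(intervals, points, 0.25, 0.75))   # 2
```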

THEOREM 1.6. Let $\mathcal{F}$ be a separable uniformly bounded family of
measurable functions on a measurable space $(X, \mathcal{X})$, and let
$\gamma > 0$. Consider:

a. $\mathcal{F}$ has finite $\gamma$-dimension.

b. No sequence in $\mathcal{F}$ is Boolean independent at levels $(\alpha, \beta)$ with $\beta - \alpha \ge \gamma$.

c. $N_{[]}(\mathcal{F}, \varepsilon, \mu) < \infty$ for every $\varepsilon > \gamma$ and every probability measure $\mu$.

Then the implications $a \Rightarrow b \Rightarrow c$ hold.

The notion of $\gamma$-dimension appears in Alon et al. [5] (called
$V_{\gamma/2}$-dimension there). The implication $a \Rightarrow c$ of Theorem 1.6
contains the recent results of Adams and Nobel [1, 3, 2]. Let us note that
condition b is strictly weaker than condition a: for example, the class
$\mathcal{F} = \{\mathbf{1}_C : C \text{ is a finite subset of } \mathbb{N}\}$
has infinite $\gamma$-dimension for $\gamma < 1$, but does not contain a Boolean
independent sequence. Similarly, condition c is strictly weaker than condition b:
if $X = \{x \in \{0, 1\}^{\mathbb{N}} : \lim_{n \to \infty} x_n = 0\}$ and
$\mathcal{F} = \{\mathbf{1}_{\{x \in X : x_j = 1\}} : j \in \mathbb{N}\}$, then
$\mathcal{F}$ contains a Boolean independent sequence, but all the bracketing
numbers are finite as $X$ is countable (note that $\mathcal{F}$ does not contain
a Boolean $\sigma$-independent sequence, so there is no contradiction with
Theorem 1.3). Condition b is dual (in the sense of Assouad [7]) to the
nonexistence of a $\gamma$-shattered sequence in $X$. A connection between the
latter and the universal Glivenko-Cantelli property for families of indicators is
considered by Dudley, Giné and Zinn [15].

An interesting question arising from Theorem 1.6 is as follows. If $\mathcal{F}$
is uniformly bounded and has finite $\gamma$-dimension for all $\gamma > 0$, then
$\sup_\mu N(\mathcal{F}, \gamma, \mu) < \infty$ for all $\gamma > 0$, that is, the
covering numbers of $\mathcal{F}$ are bounded uniformly with respect to the
underlying probability measure (see [25] for a quantitative statement). If
$\mathcal{F}$ is a family of indicators, we have in fact the polynomial bound
$\sup_\mu N(\mathcal{F}, \varepsilon, \mu) \lesssim \varepsilon^{-d}$
[13, Theorem 4.6.1]. In view of Theorem 1.6, one might ask whether one can
similarly obtain uniform or quantitative bounds on the bracketing numbers of
$\mathcal{F}$. Unfortunately, this is not the case:
$N_{[]}(\mathcal{F}, \varepsilon, \mu)$ can blow up arbitrarily quickly as
$\varepsilon \downarrow 0$. The following result is based on a combinatorial
construction of Alon, Haussler, and Welzl [6].

Proposition 1.7. There exists a countable class $\mathcal{C}$ of subsets of
$\mathbb{N}$, whose Vapnik-Chervonenkis dimension is two (that is, the
$\gamma$-dimension of $\{\mathbf{1}_C : C \in \mathcal{C}\}$ is two for all
$0 < \gamma < 1$) such that the following holds: for any function
$n(\varepsilon) \uparrow \infty$ as $\varepsilon \downarrow 0$, there is a
probability measure $\mu$ on $\mathbb{N}$ such that
$N_{[]}(\mathcal{C}, \varepsilon, \mu) \ge n(\varepsilon)$ for all
$0 < \varepsilon < 1/3$. In particular,
$\sup_\mu N_{[]}(\mathcal{C}, \varepsilon, \mu) = \infty$ for all
$0 < \varepsilon < 1/3$.

Probabilistically, this result has the following consequence. In contrast to the 
universal Glivenko-Cantelli property, it is known that both the uniform Glivenko- 
Cantelli property and the universal Donsker property are equivalent to finiteness 
of the Vapnik-Chervonenkis dimension for image admissible Suslin classes of sets 
(see [13], p. 225 and p. 215, respectively). These results are proved using sym- 
metrization arguments. In view of Theorem 1.6, one might expect that it is possible 
to provide an alternative proof of these results for separable classes using brack- 
eting methods (as in [13, Chapter 7]). However, this would require either uniform 
or quantitative control of the bracketing numbers, both of which are ruled out by 
Proposition 1.7. 

The original motivation of the author was an attempt to characterize uniformity
classes for reverse martingales that appear in filtering theory. In a recent
paper, Adams and Nobel [2] showed that Vapnik-Chervonenkis classes of sets are
uniformity classes for the convergence of empirical measures of stationary
ergodic sequences; their proof could be extended to more general random measures.
A simplified argument, which makes the connection with bracketing, appeared
subsequently in [3]. While attempting to understand the results of [2], the
author realized that the techniques used in the proof are closely related to a
set of techniques developed by Bourgain, Fremlin and Talagrand [8, 30] to study
pointwise compact sets of measurable functions. The proof of Theorem 1.3 is based
on this elegant theory, which does not appear to be well known in the probability
literature (however, the proofs of our main results, Theorem 1.3, Corollary 1.4,
and Theorem 1.6, are intended to be essentially self-contained).

A key innovation in this paper is the construction in section 2 of a "weakly
dense" set, which allows us to prove the implication $4 \Rightarrow 2$ in
Theorem 1.3 (and $b \Rightarrow c$ in Theorem 1.6). This result is the essential
step that closes the circle of implications in Theorem 1.3 and Corollary 1.4.
Many of the remaining implications are essentially known, albeit in more
restrictive settings and/or using significantly more complicated proofs: these
results are unified here in what appears to be (in view of the simplicity of the
proofs and the counterexamples above and in Appendix C) their natural setting. In
a topological setting (continuous functions on a compact space), the equivalence
of 1, 3, 4 in Theorem 1.3 can be deduced by combining [30, Theorem 14-1-7] with
Talagrand's characterization of the $\mu$-Glivenko-Cantelli property
[30, Theorem 11-1-1], [31] (note that in this setting the distinction between
Boolean independent and $\sigma$-independent sequences is irrelevant). The
equivalence between 3, 4 in Theorem 1.3 is also obtained in [8, Theorem 4D] by a
much more complicated method. The implication $5 \Rightarrow 2$ follows from the
characterization of uniformity classes for setwise convergence of Stute [29] and
Topsøe [32]. The implications $2 \Rightarrow 1, 5\text{-}8$ follow from the
classical Blum-DeHardt argument, up to measurability problems that are resolved
here. Finally, the implication $a \Rightarrow c$ (but not $b \Rightarrow c$) of
Theorem 1.6 is shown in [3] for the special case of Vapnik-Chervonenkis classes
of sets.

The remainder of this paper is organized as follows. We first prove Theorem 
1.6 in section 2. The proofs of Theorem 1.3, Corollary 1.4, and Proposition 1.7 
are subsequently given in sections 3, 4, and 5, respectively. Finally, Appendix A 
and Appendix B develop some properties of Boolean $\sigma$-independent sequences
and decomposition theorems that are used in the proofs of our main results, while 
Appendix C is devoted to the aforementioned counterexamples to Theorem 1.3 in 
nonstandard spaces. 

2. Proof of Theorem 1.6. In this section, we fix a measurable space
$(X, \mathcal{X})$ and a separable uniformly bounded family of measurable
functions $\mathcal{F}$. Let $\mathcal{F}_0 \subseteq \mathcal{F}$ be a countable
family that is dense in $\mathcal{F}$ in the pointwise convergence topology.

Definition 2.1. Denote by $\Pi(X, \mathcal{X})$ the collection of all finite
measurable partitions of $X$. For $\pi, \pi' \in \Pi(X, \mathcal{X})$, we write
$\pi \preceq \pi'$ if $\pi$ is finer than $\pi'$. For any pair of sets
$A, B \in \mathcal{X}$, finite partition $\pi \in \Pi(X, \mathcal{X})$, and
probability measure $\mu$ on $(X, \mathcal{X})$, define the $\mu$-essential
$\pi$-boundary of $(A, B)$ as

$$\partial_\pi^\mu(A, B) = \bigcup \{P \in \pi : \mu(P \cap A) > 0 \text{ and } \mu(P \cap B) > 0\}.$$
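
To make Definition 2.1 concrete, the following sketch computes the essential boundary for a purely atomic measure on a finite set. The representation of the measure as a dictionary of point masses, the function name, and the toy partitions are all illustrative assumptions.

```python
def essential_boundary(partition, mu, A, B):
    """Union of the cells P of `partition` with mu(P & A) > 0 and mu(P & B) > 0.

    partition: list of sets of points; mu: dict mapping point -> mass; A, B: sets.
    """
    def mass(S):
        return sum(mu.get(x, 0.0) for x in S)

    boundary = set()
    for P in partition:
        if mass(P & A) > 0 and mass(P & B) > 0:
            boundary |= P
    return boundary

# Hypothetical example on X = {0, ..., 5} with the uniform measure.
mu = {x: 1 / 6 for x in range(6)}
A, B = {0, 1, 2}, {3, 4, 5}
coarse = [{0, 1, 2, 3}, {4, 5}]
fine = [{0, 1, 2}, {3}, {4, 5}]

print(essential_boundary(coarse, mu, A, B))  # {0, 1, 2, 3}
print(essential_boundary(fine, mu, A, B))    # set()

# Removing the mass at the point 3 empties the boundary of the coarse partition,
# which is the sense in which the boundary is "essential".
mu_null = dict(mu)
mu_null[3] = 0.0
print(essential_boundary(coarse, mu_null, A, B))  # set()
```

Refining the partition shrinks the boundary, which is the mechanism exploited in the approximation argument that follows.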
We begin by proving an approximation result. 

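LEMMA 2.2. Let $\mu$ be a probability measure on $(X, \mathcal{X})$ and let
$\gamma > 0$. If

$$\inf_{\pi \in \Pi(X, \mathcal{X})} \sup_{f \in \mathcal{F}_0} \mu\big(\partial_\pi^\mu(\{f < \alpha\}, \{f > \beta\})\big) = 0 \quad \text{for all } \beta - \alpha \ge \gamma,$$

then $N_{[]}(\mathcal{F}, \varepsilon, \mu) < \infty$ for every
$\varepsilon > \gamma$.

PROOF. There is clearly no loss of generality in assuming that every
$f \in \mathcal{F}$ takes values in $[0, 1]$ and that $\gamma < 1$. Fix
$k \ge 1$, and let $\delta := \gamma / k$. Choose
$\pi \in \Pi(X, \mathcal{X})$ so that

$$\sup_{f \in \mathcal{F}_0} \mu(\Xi(f)) \le \delta, \qquad \Xi(f) := \bigcup_{1 \le j \le \lfloor \delta^{-1} \rfloor} \partial_\pi^\mu(\{f < j\delta\}, \{f > j\delta + \gamma\}).$$

For each $f \in \mathcal{F}_0$, define the functions $f^+$ and $f^-$ as follows:

$$f^+ = \delta \lceil \delta^{-1} \rceil \mathbf{1}_{\Xi(f)} + \sum_{P \in \pi : P \not\subseteq \Xi(f)} \delta \lceil \delta^{-1} \operatorname{ess\,sup}_P f \rceil \mathbf{1}_P,$$

$$f^- = \sum_{P \in \pi : P \not\subseteq \Xi(f)} \delta \lfloor \delta^{-1} \operatorname{ess\,inf}_P f \rfloor \mathbf{1}_P.$$

Here $\operatorname{ess\,sup}_P f$ ($\operatorname{ess\,inf}_P f$) denotes the
essential supremum (infimum) of $f$ on the set $P$ with respect to $\mu$. By
construction, $f^- \le f \le f^+$ outside a $\mu$-null set and
$\mu(f^+ - f^-) \le \gamma + 3\delta$. Moreover, as $f^+, f^-$ are constant on
each $P \in \pi$ and take values in the finite set
$\{j\delta : 0 \le j \le \lceil \delta^{-1} \rceil\}$, there is only a finite
number of such functions. As $\mathcal{F}_0$ is countable, we can eliminate the
null set to obtain a finite number of $(\gamma + 3\delta)$-brackets in
$L^1(\mu)$ covering $\mathcal{F}_0$. But $\mathcal{F}_0$ is pointwise dense in
$\mathcal{F}$, so $N_{[]}(\mathcal{F}, \gamma + 3\delta, \mu) < \infty$, and we
may choose $\delta = \gamma / k$ arbitrarily small. □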

To proceed, we need the notion of a "weakly dense" set, which is the measure- 
theoretic counterpart of the corresponding topological notion defined in [8]. 

Definition 2.3. Given a measurable set $A \in \mathcal{X}$ and a probability
measure $\mu$ on $(X, \mathcal{X})$, the family of functions $\mathcal{F}$ is
said to be $\mu$-weakly dense over $A$ at levels $(\alpha, \beta)$ if
$\mu(A) > 0$ and for any finite collection of measurable sets
$B_1, \ldots, B_p \in \mathcal{X}$ such that $\mu(A \cap B_i) > 0$ for all
$1 \le i \le p$, there exists $f \in \mathcal{F}$ such that
$\mu(A \cap B_i \cap \{f < \alpha\}) > 0$ and
$\mu(A \cap B_i \cap \{f > \beta\}) > 0$ for all $1 \le i \le p$.

The key idea of this section, which lies at the heart of the results in this
paper, is that we can construct such a set if the bracketing numbers fail to be
finite. The proof is straightforward but requires some elementary topological
notions: the reader unfamiliar with nets is referred to the classic text [20],
while weak compactness of the unit ball in $L^2$ follows from Alaoglu's theorem
[12, Theorem V.3.1].

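PROPOSITION 2.4. Suppose there exists a probability measure $\mu$ on
$(X, \mathcal{X})$ such that $N_{[]}(\mathcal{F}, \varepsilon, \mu) = \infty$ for
some $\varepsilon > \gamma$. Then there exist $\alpha < \beta$ with
$\beta - \alpha \ge \gamma$ and a measurable set $A \in \mathcal{X}$ such that
$\mathcal{F}_0$ is $\mu$-weakly dense over $A$ at levels $(\alpha, \beta)$.

PROOF. By Lemma 2.2, there exist $\alpha < \beta$ with
$\beta - \alpha \ge \gamma$ such that

$$\inf_{\pi \in \Pi(X, \mathcal{X})} \sup_{f \in \mathcal{F}_0} \mu\big(\partial_\pi^\mu(\{f < \alpha\}, \{f > \beta\})\big) > 0.$$

Choose for every $\pi \in \Pi(X, \mathcal{X})$ a function
$f_\pi \in \mathcal{F}_0$ such that

$$\mu\big(\partial_\pi^\mu(\{f_\pi < \alpha\}, \{f_\pi > \beta\})\big) \ge \frac{1}{2} \sup_{f \in \mathcal{F}_0} \mu\big(\partial_\pi^\mu(\{f < \alpha\}, \{f > \beta\})\big).$$

Define $A_\pi := \partial_\pi^\mu(\{f_\pi < \alpha\}, \{f_\pi > \beta\})$. Then
$(\mathbf{1}_{A_\pi})_{\pi \in \Pi(X, \mathcal{X})}$ is a net of random variables
in the unit ball of $L^2(\mu)$. By weak compactness, there is for some directed
set $T$ a subnet $(\mathbf{1}_{A_{\pi(\tau)}})_{\tau \in T}$ that converges
weakly in $L^2(\mu)$ to a random variable $H$. We claim that $\mathcal{F}_0$ is
$\mu$-weakly dense over $A := \{H > 0\}$ at levels $(\alpha, \beta)$.

To prove the claim, let us first note that as $\inf_\pi \mu(A_\pi) > 0$, clearly
$\mu(A) > 0$. Now fix $B_1, \ldots, B_p \in \mathcal{X}$ such that
$\mu(A \cap B_i) > 0$ for all $i$. This trivially implies that
$\mu(H \mathbf{1}_{A \cap B_i}) > 0$ for all $i$, so we can choose
$\tau_0 \in T$ such that

$$\mu(A_{\pi(\tau)} \cap A \cap B_i) > 0 \quad \forall\, 1 \le i \le p, \ \tau \succeq \tau_0.$$

Let $\pi_0$ be the partition generated by $A, B_1, \ldots, B_p$, and choose
$\tau^* \in T$ such that $\tau^* \succeq \tau_0$ and
$\pi^* := \pi(\tau^*) \preceq \pi_0$. As $A_{\pi^*}$ is a union of atoms of
$\pi^*$ by construction, $\mu(A_{\pi^*} \cap A \cap B_i) > 0$ must imply that
$A \cap B_i$ contains an atom $P \in \pi^*$ such that
$\mu(P \cap \{f_{\pi^*} < \alpha\}) > 0$ and
$\mu(P \cap \{f_{\pi^*} > \beta\}) > 0$. Therefore

$$\mu(A \cap B_i \cap \{f_{\pi^*} < \alpha\}) > 0 \quad \text{and} \quad \mu(A \cap B_i \cap \{f_{\pi^*} > \beta\}) > 0 \quad \forall i.$$

Thus $\mathcal{F}_0$ is $\mu$-weakly dense over $A$ at levels $(\alpha, \beta)$
as claimed. □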

We can now complete the proof of Theorem 1.6. 

PROOF OF THEOREM 1.6.

$a \Rightarrow b$: Lemma A.3 in Appendix A shows that if $\mathcal{F}$ contains a
subset of cardinality $2^n$ that is Boolean independent at levels
$(\alpha, \beta)$ with $\beta - \alpha \ge \gamma$, then $\mathcal{F}$
$\gamma$-shatters a subset of $X$ of cardinality $n$. Therefore, if condition b
fails, there exist $\gamma$-shattered finite subsets of $X$ of arbitrarily large
cardinality, in contradiction with condition a.

$b \Rightarrow c$: Suppose that condition c fails. By Proposition 2.4, there
exist a probability measure $\mu$, levels $\alpha < \beta$ with
$\beta - \alpha \ge \gamma$, and a set $A \in \mathcal{X}$ so that
$\mathcal{F}_0$ is $\mu$-weakly dense over $A$ at levels $(\alpha, \beta)$. We
now iteratively apply Definition 2.3 to construct a Boolean independent sequence.
Indeed, applying first the definition with $p = 1$ and $B_1 = X$, we choose
$f_1 \in \mathcal{F}_0$ so that $\mu(A \cap \{f_1 < \alpha\}) > 0$ and
$\mu(A \cap \{f_1 > \beta\}) > 0$. Then applying the definition with $p = 2$ and
$B_1 = \{f_1 < \alpha\}$, $B_2 = \{f_1 > \beta\}$, we choose
$f_2 \in \mathcal{F}_0$ so that
$\mu(A \cap \{f_1 < \alpha\} \cap \{f_2 < \alpha\}) > 0$,
$\mu(A \cap \{f_1 < \alpha\} \cap \{f_2 > \beta\}) > 0$,
$\mu(A \cap \{f_1 > \beta\} \cap \{f_2 < \alpha\}) > 0$, and
$\mu(A \cap \{f_1 > \beta\} \cap \{f_2 > \beta\}) > 0$. Repeating this procedure
yields the desired sequence $(f_i)_{i \in \mathbb{N}}$. □

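3. Proof of Theorem 1.3. Throughout this section, we fix a standard measurable
space $(X, \mathcal{X})$ and a separable uniformly bounded family of measurable
functions $\mathcal{F}$. We will prove Theorem 1.3 by proving the implications
$1 \Rightarrow 4 \Rightarrow 2 \Rightarrow 1$ and
$2 \Rightarrow 3 \Rightarrow 4$.

3.1. $1 \Rightarrow 4$. Suppose there exists a sequence
$(f_i)_{i \in \mathbb{N}} \subseteq \mathcal{F}$ that is Boolean
$\sigma$-independent at levels $(\alpha, \beta)$ for some $\alpha < \beta$.
Clearly we must have

$$\kappa_- < \alpha < \beta < \kappa_+, \qquad \kappa_- := \inf_{f \in \mathcal{F}} \inf_{x \in X} f(x), \qquad \kappa_+ := \sup_{f \in \mathcal{F}} \sup_{x \in X} f(x).$$

Let $p = (\kappa_+ - \beta + \varepsilon)/(\kappa_+ - \alpha)$, where we choose
$\varepsilon > 0$ such that $p < 1$. Applying Theorem A.1 in Appendix A to the
sets $A_i = \{f_i < \alpha\}$ and $B_i = \{f_i > \beta\}$, there exists a
probability measure $\mu$ on $(X, \mathcal{X})$ such that
$(\{f_i < \alpha\})_{i \in \mathbb{N}}$ is an i.i.d. sequence of sets with
$\mu(\{f_i < \alpha\}) = \mu(X \setminus \{f_i > \beta\}) = p$ for every
$i \in \mathbb{N}$.

We now claim that $\mathcal{F}$ is not $\mu$-Glivenko-Cantelli, which yields the
desired contradiction. To this end, note that we can trivially estimate for any
$f \in \mathcal{F}$

$$\beta\, \mathbf{1}_{f > \beta} + \kappa_-\, \mathbf{1}_{f \le \beta} \le f \le \alpha\, \mathbf{1}_{f < \alpha} + \kappa_+\, \mathbf{1}_{f \ge \alpha}.$$

The upper bound gives
$\mu(f_j) \le \alpha p + \kappa_+ (1 - p) = \beta - \varepsilon$, while the lower
bound controls the empirical averages from below. We therefore have

$$\sup_{f \in \mathcal{F}} \left| \frac{1}{n} \sum_{k=1}^{n} f(X_k) - \mu(f) \right| \ge \varepsilon - (\beta - \kappa_-) \inf_{j \in \mathbb{N}} \frac{1}{n} \sum_{k=1}^{n} \mathbf{1}_{f_j \le \beta}(X_k).$$

But if $(X_k)_{k \ge 1}$ are i.i.d. with distribution $\mu$ then, by
construction, the family of random variables
$\{\mathbf{1}_{f_j \le \beta}(X_k) : j, k \in \mathbb{N}\}$ is i.i.d. with
$\mathbf{P}[\mathbf{1}_{f_j \le \beta}(X_k) = 0] > 0$, so

$$\inf_{j \in \mathbb{N}} \frac{1}{n} \sum_{k=1}^{n} \mathbf{1}_{f_j \le \beta}(X_k) = 0 \quad \text{a.s. for all } n \in \mathbb{N}.$$

Thus $\mathcal{F}$ is not a $\mu$-Glivenko-Cantelli class. This completes the
proof.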

3.2. $4 \Rightarrow 2$. Suppose there exists a probability measure $\mu$ and
$\varepsilon > 0$ such that $N_{[]}(\mathcal{F}, \varepsilon, \mu) = \infty$. By
Proposition 2.4, there exist levels $\alpha < \beta$ and a set
$A \in \mathcal{X}$ such that $\mathcal{F}$ is $\mu$-weakly dense over $A$ at
levels $(\alpha, \beta)$. We will presently construct a Boolean
$\sigma$-independent sequence, which yields the desired contradiction. The idea
is to repeat the proof of Theorem 1.6, but now exploiting the fact that
$(X, \mathcal{X})$ is standard to ensure that the infinite intersections in the
definition of Boolean $\sigma$-independence are nonempty.

As $(X, \mathcal{X})$ is standard, we may assume without loss of generality that
$X$ is Polish and that $\mathcal{X}$ is the Borel $\sigma$-field. Thus $\mu$ is
inner regular. We now apply Definition 2.3 as follows. First, setting $p = 1$ and
$B_1 = X$, choose $f_1 \in \mathcal{F}$ such that

$$\mu(A \cap \{f_1 < \alpha\}) > 0, \qquad \mu(A \cap \{f_1 > \beta\}) > 0.$$

As $\mu$ is inner regular, we may choose compact sets
$F_1 \subseteq \{f_1 < \alpha\}$ and $G_1 \subseteq \{f_1 > \beta\}$ such that
$\mu(A \cap F_1) > 0$ and $\mu(A \cap G_1) > 0$. Applying the definition with
$p = 2$, $B_1 = F_1$, and $B_2 = G_1$, we can choose $f_2 \in \mathcal{F}$ such
that

$$\mu(A \cap F_1 \cap \{f_2 < \alpha\}) > 0, \qquad \mu(A \cap F_1 \cap \{f_2 > \beta\}) > 0,$$
$$\mu(A \cap G_1 \cap \{f_2 < \alpha\}) > 0, \qquad \mu(A \cap G_1 \cap \{f_2 > \beta\}) > 0.$$

Using again inner regularity, we can now choose compact sets
$F_2 \subseteq \{f_2 < \alpha\}$ and $G_2 \subseteq \{f_2 > \beta\}$ such that
$\mu(A \cap F_1 \cap F_2) > 0$, $\mu(A \cap F_1 \cap G_2) > 0$,
$\mu(A \cap G_1 \cap F_2) > 0$, and $\mu(A \cap G_1 \cap G_2) > 0$. Iterating the
above steps, we construct a sequence of functions
$(f_i)_{i \in \mathbb{N}} \subseteq \mathcal{F}$ and compact sets
$(F_i)_{i \in \mathbb{N}}$, $(G_i)_{i \in \mathbb{N}}$ such that
$F_i \subseteq \{f_i < \alpha\}$, $G_i \subseteq \{f_i > \beta\}$ for every
$i \in \mathbb{N}$, and for any $n \in \mathbb{N}$

$$\mu\Big( \bigcap_{j \in Q} F_j \cap \bigcap_{j \in \{1, \ldots, n\} \setminus Q} G_j \Big) > 0 \quad \text{for every } Q \subseteq \{1, \ldots, n\}.$$

Now suppose that the sequence is not Boolean $\sigma$-independent. Then

$$\bigcap_{j \in R} \{f_j < \alpha\} \cap \bigcap_{j \notin R} \{f_j > \beta\} = \varnothing$$

for some $R \subseteq \mathbb{N}$. Thus we certainly have

$$\bigcap_{j \in R} F_j \cap \bigcap_{j \notin R} G_j = \varnothing.$$

Choose arbitrary $\ell \in R$ (if $R$ is the empty set, replace $F_\ell$ by $G_1$
throughout the following argument). Then clearly
$\{X \setminus F_j : j \in R\} \cup \{X \setminus G_j : j \notin R\}$ is an open
cover of $F_\ell$. Therefore, there exist finite subsets $Q_1 \subseteq R$,
$Q_2 \subseteq \mathbb{N} \setminus R$ such that
$\{X \setminus F_j : j \in Q_1\} \cup \{X \setminus G_j : j \in Q_2\}$ covers
$F_\ell$. But then

$$F_\ell \cap \bigcap_{j \in Q_1} F_j \cap \bigcap_{j \in Q_2} G_j = \varnothing,$$

a contradiction. Thus $(f_i)_{i \in \mathbb{N}}$ is Boolean $\sigma$-independent
at levels $(\alpha, \beta)$.

3.3. $2 \Rightarrow 1$. This is the usual Blum-DeHardt argument, included here
for completeness. Fix a probability measure $\mu$ and $\varepsilon > 0$, and
suppose that $N_{[]}(\mathcal{F}, \varepsilon, \mu) < \infty$. Choose
$\varepsilon$-brackets $[f_1, g_1], \ldots, [f_N, g_N]$ in $L^1(\mu)$ covering
$\mathcal{F}$. Then

$$\sup_{f \in \mathcal{F}} |\mu_n(f) - \mu(f)| = \sup_{f \in \mathcal{F}} \{\mu_n(f) - \mu(f)\} \vee \sup_{f \in \mathcal{F}} \{\mu(f) - \mu_n(f)\}$$
$$\le \max_{i = 1, \ldots, N} \{\mu_n(g_i) - \mu(f_i)\} \vee \max_{i = 1, \ldots, N} \{\mu(g_i) - \mu_n(f_i)\},$$

where we define the empirical measure
$\mu_n := \frac{1}{n} \sum_{k=1}^{n} \delta_{X_k}$ for an i.i.d. sequence
$(X_k)_{k \in \mathbb{N}}$ with distribution $\mu$. The right hand side in the
above expression is measurable and converges a.s. to a constant not exceeding
$\varepsilon$ by the law of large numbers. As $\varepsilon > 0$ and $\mu$ were
arbitrary, $\mathcal{F}$ is universal Glivenko-Cantelli.
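
The bracketing inequality above can be illustrated numerically in a simple special case. The sketch below is an assumption-laden example rather than part of the proof: it takes the class of half-line indicators $\mathbf{1}_{[0,u]}$, $0 \le u \le 1$, under the uniform measure on $[0, 1]$, for which explicit $\varepsilon$-brackets are available on a grid, and compares the bracket bound with the exact supremum. All function names are hypothetical.

```python
import math

def half_line_brackets(eps):
    """Endpoints (s, t) of brackets [1_{[0,s]}, 1_{[0,t]}] covering {1_{[0,u]} : 0 <= u <= 1}.

    Under the uniform measure on [0, 1], each bracket has L^1 size t - s <= eps.
    """
    n_brackets = math.ceil(1.0 / eps)
    grid = [i / n_brackets for i in range(n_brackets + 1)]
    return list(zip(grid[:-1], grid[1:]))

def empirical_cdf(sample, t):
    """mu_n(1_{[0,t]}): fraction of sample points that are <= t."""
    return sum(1 for x in sample if x <= t) / len(sample)

def bracket_bound(sample, eps):
    """Right-hand side of the bracketing inequality for the half-line class."""
    bound = 0.0
    for s, t in half_line_brackets(eps):
        upper = empirical_cdf(sample, t) - s   # mu_n(g_i) - mu(f_i)
        lower = t - empirical_cdf(sample, s)   # mu(g_i) - mu_n(f_i)
        bound = max(bound, upper, lower)
    return bound

def exact_sup_deviation(sample):
    """sup over u in [0, 1] of |mu_n(1_{[0,u]}) - u| for distinct points in [0, 1]."""
    xs = sorted(sample)
    n = len(xs)
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))

# A fixed toy sample, chosen by hand for reproducibility.
sample = [0.05, 0.2, 0.21, 0.5, 0.77, 0.9]
print(exact_sup_deviation(sample) <= bracket_bound(sample, 0.25))  # True
```

As the sample size grows, each term of the bound converges to the size of a single bracket, which is at most $\varepsilon$.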

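3.4. $2 \Rightarrow 3 \Rightarrow 4$. As
$N(\mathcal{F}, \varepsilon, \mu) \le N_{[]}(\mathcal{F}, 2\varepsilon, \mu)$,
the implication $2 \Rightarrow 3$ is trivial. It therefore remains to prove the
implication $3 \Rightarrow 4$.

To this end, suppose that there exists a sequence
$(f_i)_{i \in \mathbb{N}} \subseteq \mathcal{F}$ that is Boolean
$\sigma$-independent at levels $(\alpha, \beta)$ for some $\alpha < \beta$.
Construct the probability measure $\mu$ as in the proof of the implication
$1 \Rightarrow 4$. We claim that $N(\mathcal{F}, \varepsilon, \mu) = \infty$ for
$\varepsilon > 0$ sufficiently small, which yields the desired contradiction.

To prove the claim, it suffices to note that for any $i \ne j$

$$\mu(|f_i - f_j|) \ge \mu\big(|f_i - f_j|\, \mathbf{1}_{\{f_i < \alpha\} \cap \{f_j > \beta\}}\big) \ge (\beta - \alpha)\, \mu(\{f_i < \alpha\} \cap \{f_j > \beta\}) = (\beta - \alpha) p (1 - p) > 0$$

by the construction of $\mu$. Therefore $\mathcal{F}$ contains an infinite set of
$(\beta - \alpha) p (1 - p)$-separated points in $L^1(\mu)$, so
$N(\mathcal{F}, (\beta - \alpha) p (1 - p)/2, \mu) = \infty$.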






RAMON VAN HANDEL 



3.5. A remark about a.s. convergence and measurability. When the class
$\mathcal{F}$ is only assumed to be separable, the quantity

$$\Gamma_n(\mathcal{F}, \mu) := \sup_{f \in \mathcal{F}} \left| \frac{1}{n} \sum_{k=1}^{n} f(X_k) - \mu(f) \right|$$

may well be nonmeasurable. For nonmeasurable functions, there are inequivalent
notions of convergence that coincide with a.s. convergence in the measurable
case. In this paper, following Talagrand [31], we defined
$\mu$-Glivenko-Cantelli classes as those for which the quantity
$\Gamma_n(\mathcal{F}, \mu)$ converges to zero a.s., that is, pointwise outside a
set of probability zero. A different definition, given by Dudley
[13, section 3.3], is to require that $\Gamma_n(\mathcal{F}, \mu)$ converges to
zero almost uniformly, that is, it is dominated by a sequence of measurable
random variables converging to zero a.s.

For nonmeasurable functions, almost uniform convergence is in general much
stronger than a.s. convergence. Nonetheless, in the fundamental paper
characterizing the $\mu$-Glivenko-Cantelli property, Talagrand showed
[31, Theorem 22] that for $\mu$-Glivenko-Cantelli classes a.s. convergence
already implies almost uniform convergence. Thus this is certainly the case for
universal Glivenko-Cantelli classes. In the setting of Theorem 1.3, the latter
can also be seen directly: indeed, the proof of the implication
$1 \Rightarrow 4$ requires only a.s. convergence, while the Blum-DeHardt argument
$2 \Rightarrow 1$ automatically yields the stronger notion of almost uniform
convergence.

However, let us note that in Corollary 4.2 below we will prove an even stronger
property: for separable uniformly bounded classes $\mathcal{F}$ with finite
bracketing numbers, the quantity
$\sup_{f \in \mathcal{F}} |\nu(f) - \mu(f)|$ is Borel-measurable for arbitrary
random probability measures $\nu, \mu$. Thus $\Gamma_n(\mathcal{F}, \mu)$ is
automatically measurable for universal Glivenko-Cantelli classes satisfying the
assumptions of Theorem 1.3, though this is far from obvious a priori. Similarly,
if any of the equivalent conditions of Theorem 1.3 or Corollary 1.4 holds, then
all the suprema in Corollary 1.4 are measurable. It follows that a.s. and almost
uniform convergence coincide trivially in our main results.

4. Proof of Corollary 1.4. Throughout this section, we fix a standard measurable
space $(X, \mathcal{X})$ and a separable uniformly bounded family of measurable
functions $\mathcal{F}$. We will prove Corollary 1.4 by proving the implications
$2 \Leftrightarrow 5$ and $2 \Rightarrow \{6, 7, 8\} \Rightarrow 1$. The
implication $5 \Rightarrow 2$ is related to a result of Topsøe [32], though we
give here a direct proof inspired by Stute [29]. The remaining implications are
straightforward modulo measurability issues.

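4.1. $2 \Leftrightarrow 5$. The implication $2 \Rightarrow 5$ follows from the
Blum-DeHardt argument as in section 3.3. Conversely, suppose that condition 2
does not hold, so that $N_{[]}(\mathcal{F}, \varepsilon, \mu) = \infty$ for some
$\varepsilon > 0$ and probability measure $\mu$. Then by Lemma 2.2, there exist
$\delta > 0$ and $\alpha < \beta$ such that we can choose for every
$\pi \in \Pi(X, \mathcal{X})$ a function $f_\pi \in \mathcal{F}$ with

$$\mu(D_\pi) \ge \delta, \qquad D_\pi := \partial_\pi^\mu(\{f_\pi < \alpha\}, \{f_\pi > \beta\}).$$

We now define for every $\pi \in \Pi(X, \mathcal{X})$ two probability measures
$\mu_\pi^+, \mu_\pi^-$ as follows. For every $P \in \pi$ such that
$P \subseteq D_\pi$, choose two points $x_P^+ \in P \cap \{f_\pi > \beta\}$ and
$x_P^- \in P \cap \{f_\pi < \alpha\}$ arbitrarily, and define for every
$A \in \mathcal{X}$

$$\mu_\pi^\pm(A) = \mu(A \setminus D_\pi) + \sum_{P \in \pi : P \subseteq D_\pi} \mu(P)\, \mathbf{1}_A(x_P^\pm).$$

Then $(\mu_\pi^\pm)_{\pi \in \Pi(X, \mathcal{X})}$ is a net of probability
measures that converges to $\mu$ setwise: indeed, for every $A \in \mathcal{X}$,
we have $\mu_\pi^\pm(A) = \mu(A)$ whenever $\pi \preceq \pi_A$ with
$\pi_A = \{A, X \setminus A\}$. On the other hand, by construction we have

$$\sup_{f \in \mathcal{F}} |\mu_\pi^+(f) - \mu_\pi^-(f)| \ge |\mu_\pi^+(f_\pi) - \mu_\pi^-(f_\pi)| \ge (\beta - \alpha)\, \mu(D_\pi) \ge (\beta - \alpha)\, \delta$$

for every $\pi \in \Pi(X, \mathcal{X})$. Therefore either
$(\mu_\pi^+)_{\pi \in \Pi(X, \mathcal{X})}$ or
$(\mu_\pi^-)_{\pi \in \Pi(X, \mathcal{X})}$ does not converge to $\mu$ uniformly
over $\mathcal{F}$, in contradiction to condition 5.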

4.2. $2 \Rightarrow \{6, 7, 8\}$. The implication $2 \Rightarrow 6$ follows
immediately from the Blum-DeHardt argument as in section 3.3. The complication
for the implications $2 \Rightarrow \{7, 8\}$ is that the limiting measure is a
random measure (unlike $2 \Rightarrow 6$ where the limiting measure is
nonrandom). Intuitively one can simply condition on $\mathcal{G}_{-\infty}$ or
$\mathcal{I}$, respectively, so that the problem reduces to the implication
$2 \Rightarrow 6$ under the conditional measure. The main work in the proof
consists of resolving the measurability issues that arise in this approach.

Let $\mathcal{F}_0 \subseteq \mathcal{F}$ be a countable family that is dense in
$\mathcal{F}$ in the topology of pointwise convergence. We first show that
$\mathcal{F}_0$ is also $L^1(\mu)$-dense in $\mathcal{F}$ for any $\mu$: this is
not obvious, as the dominated convergence theorem does not hold for nets.

LEMMA 4.1. If $N_{[]}(\mathcal{F}, \varepsilon, \mu) < \infty$ for all
$\varepsilon > 0$, then $\mathcal{F}_0$ is $L^1(\mu)$-dense in $\mathcal{F}$.

PROOF. Fix $\varepsilon > 0$, and choose $\varepsilon$-brackets
$[f_1, g_1], \ldots, [f_N, g_N]$ in $L^1(\mu)$ covering $\mathcal{F}$. As
topological closure and finite unions commute, for every $f \in \mathcal{F}$
there exists $1 \le i \le N$ such that $f$ is in the pointwise closure of
$[f_i, g_i] \cap \mathcal{F}_0$. But then clearly $f \in [f_i, g_i]$, and
choosing any $g \in [f_i, g_i] \cap \mathcal{F}_0$ we have
$\mu(|f - g|) \le \mu(g_i - f_i) \le \varepsilon$. As $\varepsilon > 0$ is
arbitrary, the proof is complete. □



We can now reduce the suprema in conditions 7 and 8 to countable suprema.

COROLLARY 4.2. Suppose that $N_{[]}(\mathcal{F}, \varepsilon, \mu) < \infty$ for
every $\varepsilon > 0$ and probability measure $\mu$. Then for any pair of
probability measures $\mu, \nu$ we have

$$\sup_{f \in \mathcal{F}} |\mu(f) - \nu(f)| = \sup_{f \in \mathcal{F}_0} |\mu(f) - \nu(f)|.$$

In particular, this holds when $\mu$ and $\nu$ are random measures.

PROOF. Fix (nonrandom) probability measures $\mu, \nu$, and define
$\rho = (\mu + \nu)/2$. Then $\mathcal{F}_0$ is $L^1(\rho)$-dense in
$\mathcal{F}$ by Lemma 4.1. In particular, for every $f \in \mathcal{F}$ and
$\varepsilon > 0$, we can choose $g \in \mathcal{F}_0$ such that
$\mu(|f - g|) + \nu(|f - g|) \le \varepsilon$. Now let
$(f_n)_{n \in \mathbb{N}} \subseteq \mathcal{F}$ be a sequence such that
$\sup_{f \in \mathcal{F}} |\mu(f) - \nu(f)| = \lim_{n \to \infty} |\mu(f_n) - \nu(f_n)|$.
For each $f_n$, choose $g_n \in \mathcal{F}_0$ such that
$\mu(|f_n - g_n|) + \nu(|f_n - g_n|) \le n^{-1}$. Then

$$\sup_{f \in \mathcal{F}} |\mu(f) - \nu(f)| = \lim_{n \to \infty} |\mu(g_n) - \nu(g_n)| \le \sup_{f \in \mathcal{F}_0} |\mu(f) - \nu(f)|,$$

which clearly yields the result (as $\mathcal{F}_0 \subseteq \mathcal{F}$). In
the case of random probability measures, we simply apply the nonrandom result
pointwise. □

To prove 2 ⇒ 8 we use the ergodic decomposition (cf. Appendix B). Consider a stationary sequence $(Z_n)_{n \in \mathbb{N}}$ of $X$-valued random variables on a probability space $(\Omega, \mathcal{G}, \mathbf{P})$. Using Corollary 4.2 and the ergodic theorem, it suffices to prove that

$$\mathbf{P}\bigg[\limsup_{n \to \infty} \sup_{f \in \mathcal{F}_0} \bigg| \frac{1}{n} \sum_{k=1}^n f(Z_k) - \limsup_{N \to \infty} \frac{1}{N} \sum_{k=1}^N f(Z_k) \bigg| = 0 \bigg] = 1.$$

The event inside the probability is an $\mathcal{X}^{\otimes \mathbb{N}}$-measurable function of $(Z_n)_{n \in \mathbb{N}}$. Therefore, by Theorem B.1 in Appendix B, it suffices to prove the result for the case that $(Z_n)_{n \in \mathbb{N}}$ is stationary and ergodic. But in the ergodic case $\frac{1}{n} \sum_{k=1}^n f(Z_k) \to \mathbf{E}(f(Z_0))$ a.s., so that the result follows from the Blum-DeHardt argument.

To prove the implication 2 ⇒ 7, we aim to repeat the proof of 2 ⇒ 8 with a suitable tail decomposition (cf. Theorem B.2 in Appendix B). On an underlying probability space $(\Omega, \mathcal{G}, \mathbf{P})$, let $(\mathcal{G}_{-n})_{n \in \mathbb{N}}$ be a reverse filtration such that $\mathcal{G}_{-n} \subseteq \mathcal{G}$ is countably generated for each $n \in \mathbb{N}$, and consider a random variable $Z$ taking values in the standard space $(X, \mathcal{X})$. Using Corollary 4.2 and the reverse martingale convergence theorem, it evidently suffices to prove that

$$\mathbf{P}\bigg[\limsup_{n \to \infty} \sup_{f \in \mathcal{F}_0} \Big| \mathbf{E}(f(Z) \mid \mathcal{G}_{-n}) - \limsup_{N \to \infty} \mathbf{E}(f(Z) \mid \mathcal{G}_{-N}) \Big| = 0 \bigg] = 1.$$

If $(\Omega, \mathcal{G})$ is standard, then by Theorem B.2 it suffices to prove the result for the case that the tail σ-field $\mathcal{G}_{-\infty} = \bigcap_n \mathcal{G}_{-n}$ is trivial. But in that case $\mathbf{E}(f(Z) \mid \mathcal{G}_{-n}) \to \mathbf{E}(f(Z))$ a.s., so that the result follows from the Blum-DeHardt argument.

It therefore remains to show that there is no loss of generality in assuming that $(\Omega, \mathcal{G})$ is standard. To this end, choose for every $n \ge 1$ a countable generating class $(H_{n,j})_{j \in \mathbb{N}} \subseteq \mathcal{G}_{-n}$, and define the $\{0,1\}^{\mathbb{N}}$-valued random variable $Z_{-n} = (1_{H_{n,j}})_{j \in \mathbb{N}}$. Then, by construction, $\mathcal{G}_{-n} = \sigma\{Z_{-k} : k \ge n\}$. If we define $Z_0 = Z$, then it is clear that the implication 2 ⇒ 7 depends only on the law of $(Z_{-n})_{n \ge 0}$. There is therefore no loss of generality in assuming that $(\Omega, \mathcal{G})$ is the canonical space of the process $(Z_{-n})_{n \ge 0}$, which is clearly standard as $\{0,1\}^{\mathbb{N}}$ is Polish.

4.3. {6, 7, 8} ⇒ 1. These implications follow from the fact that each of the conditions {6, 7, 8} contains condition 1 as a special case. For the implication 6 ⇒ 1, it suffices to choose $\mu_n$ to be the empirical measure of an i.i.d. sequence with distribution $\mu$. Similarly, the implication 8 ⇒ 1 follows from the fact that an i.i.d. sequence is stationary and ergodic. Finally, the implication 7 ⇒ 1 follows from the following well known construction. Let $(X_k)_{k \in \mathbb{N}}$ be an i.i.d. sequence of $X$-valued random variables with distribution $\mu$, let $Z = X_1$, and let $\mathcal{G}_{-n} = \sigma\{\sum_{k=1}^n 1_A(X_k) : A \in \mathcal{X}\}$. As $(X, \mathcal{X})$ is standard, $\mathcal{X}$ and hence $\mathcal{G}_{-n}$ are countably generated. Moreover, we have

$$\mathbf{E}(f(Z) \mid \mathcal{G}_{-n}) = \mathbf{E}(f(X_l) \mid \mathcal{G}_{-n}) = \frac{1}{n} \sum_{k=1}^n \mathbf{E}(f(X_k) \mid \mathcal{G}_{-n}) = \frac{1}{n} \sum_{k=1}^n f(X_k)$$

for any bounded measurable function $f$ and $1 \le l \le n$, as the right hand side is $\mathcal{G}_{-n}$-measurable and every element of $\mathcal{G}_{-n}$ is symmetric under permutations of $\{X_1, \ldots, X_n\}$. Therefore, $\frac{1}{n} \sum_{k=1}^n \delta_{X_k}$ is a version of the regular conditional probability $\mathbf{P}(Z \in \cdot \mid \mathcal{G}_{-n})$ for every $n \ge 1$. By the law of large numbers and the martingale convergence theorem, it follows that $\mu$ is a version of the regular conditional probability $\mathbf{P}(Z \in \cdot \mid \mathcal{G}_{-\infty})$. The implication 7 ⇒ 1 is now immediate.

5. Proof of Proposition 1.7. The construction of the class $\mathcal{C}$ in Proposition 1.7 is based on a combinatorial construction due to Alon, Haussler, and Welzl [6, Theorem A(2)]. We begin by recalling the essential results in that paper, and then proceed to the proof of Proposition 1.7.

5.1. Construction. Let $q \ge 2$ be a prime number, and denote by $\mathbb{F}_q$ the finite field $\mathbb{Z}/q\mathbb{Z}$ of order $q$. In the following, we consider the three-dimensional vector space $\mathbb{F}_q^3$ over the finite field $\mathbb{F}_q$. Denote by $V_q$ the family of all one-dimensional subspaces of $\mathbb{F}_q^3$, and denote by $E_q$ the family of all two-dimensional subspaces of $\mathbb{F}_q^3$. Each element of $E_q$ is identified with a subset of $V_q$ by inclusion, that is, a two-dimensional subspace $C \in E_q$ is identified with the set of one-dimensional subspaces $x \in V_q$ contained in it. An elementary counting argument, cf. [9, section 9.3], yields the following properties:






1. $\mathrm{card}\, V_q = \mathrm{card}\, E_q = q^2 + q + 1$.

2. Every set $C \in E_q$ contains exactly $q + 1$ points in $V_q$.

3. Every point $x \in V_q$ belongs to exactly $q + 1$ sets in $E_q$.

4. For every $x, x' \in V_q$, $x \ne x'$, there is a unique set $C \in E_q$ with $x, x' \in C$.

A pair $(V_q, E_q)$ with these properties is called a finite projective plane of order $q$. For our purposes, the key property of finite projective planes is the following result due to Alon, Haussler, and Welzl, whose proof is given in [6, p. 336] (the proof is based on a combinatorial lemma proved in [4, Theorem 2.1(2)]).
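
The four counting properties can be checked by brute force for small primes. The following is a minimal sketch, not part of the original argument: the helper names `projective_plane` and `check_plane` are hypothetical, points of $V_q$ are represented by normalized vectors in $\mathbb{F}_q^3$, and each two-dimensional subspace is represented as the kernel of a nonzero linear form (an assumption of this illustration that is equivalent to the definition above).

```python
from itertools import combinations, product

def projective_plane(q):
    """Points and lines of the projective plane over Z/qZ, for a prime q."""
    def normalize(v):
        # Scale v so that its first nonzero coordinate equals 1.
        c = next(t for t in v if t % q)
        inv = pow(c, -1, q)  # modular inverse (Python 3.8+)
        return tuple(inv * t % q for t in v)

    points = sorted({normalize(v) for v in product(range(q), repeat=3) if any(v)})
    # A two-dimensional subspace is the kernel of a nonzero linear form a,
    # and a is determined up to scaling, so lines are indexed by normalized a.
    lines = [frozenset(x for x in points
                       if sum(ai * xi for ai, xi in zip(a, x)) % q == 0)
             for a in points]
    return points, lines

def check_plane(q):
    """Verify properties 1-4 by exhaustive enumeration; return q^2 + q + 1."""
    points, lines = projective_plane(q)
    m = q * q + q + 1
    assert len(points) == m and len(set(lines)) == m               # property 1
    assert all(len(C) == q + 1 for C in lines)                      # property 2
    assert all(sum(x in C for C in lines) == q + 1 for x in points)  # property 3
    assert all(sum(x in C and y in C for C in lines) == 1           # property 4
               for x, y in combinations(points, 2))
    return m
```

For example, `check_plane(2)` enumerates the Fano plane with seven points and seven lines.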

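PROPOSITION 5.1. Let $q \ge 2$ be prime, define $m = q^2 + q + 1$, and let $\varepsilon > 0$. Then for any partition $\pi$ of $V_q$ such that $(\mathrm{card}\,\pi)^2 < m^{1/2}(1 - \varepsilon)$, we have

$$\max_{C \in E_q} \frac{\mathrm{card}\, \partial_\pi C}{m} > \varepsilon.$$

Here we defined the $\pi$-boundary $\partial_\pi C := \bigcup \{P \in \pi : P \cap C \ne \varnothing \text{ and } P \not\subseteq C\}$.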

We now proceed to construct the class $\mathcal{C}$ in Proposition 1.7. Let $q_j \uparrow \infty$ be an increasing sequence of primes ($q_j \ge 2$), and define $m_j = q_j^2 + q_j + 1$. We now partition $\mathbb{N}$ into consecutive blocks of length $m_j$, as follows:

$$\mathbb{N} = \bigcup_{j=1}^\infty N_j, \qquad N_j = \bigg\{ \sum_{i=1}^{j-1} m_i + 1, \ldots, \sum_{i=1}^{j} m_i \bigg\}.$$

Define $\mathcal{C}$ as the disjoint union of copies of $E_{q_j}$ defined on the blocks $N_j$: that is, choose for every $j$ a bijection $\iota_j : V_{q_j} \to N_j$, and define

$$\mathcal{C} = \bigcup_{j=1}^\infty \mathcal{C}_j, \qquad \mathcal{C}_j = \{B \subseteq N_j : \iota_j^{-1}(B) \in E_{q_j}\}.$$

We claim that the countable class $\mathcal{C}$ of subsets of $\mathbb{N}$ has $\gamma$-dimension two.

LEMMA 5.2. $\mathcal{C}$ has Vapnik-Chervonenkis dimension two.

PROOF. Choose any three distinct points $n_1, n_2, n_3 \in \mathbb{N}$. If two of these points are in distinct intervals $N_j$, then no set in $\mathcal{C}$ contains both points. On the other hand, suppose that all three points are in the same interval $N_j$. Then by the properties of the finite projective plane, either there is no set in $\mathcal{C}$ that contains all three points, or there is no set that contains two of the points but not the third (as each pair of points must lie in a unique set in $\mathcal{C}$). Thus we have shown that no family of three points $\{n_1, n_2, n_3\}$ is $\gamma$-shattered for $0 < \gamma < 1$. On the other hand, it is easily seen that the properties of the finite projective plane imply that any pair of points $\{n_1, n_2\}$ belonging to the same interval $N_j$ is $\gamma$-shattered for $0 < \gamma < 1$. □
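
The shattering claims in this proof can be confirmed numerically on a single block. The sketch below is illustrative only: it works with one projective plane (one block $N_j$) rather than the full class, uses the classical notion of shattering for indicator functions, and the helper names `plane`, `is_shattered`, and `vc_dimension` are hypothetical.

```python
from itertools import combinations, product

def plane(q):
    """Lines of the projective plane over Z/qZ as frozensets of point indices."""
    def normalize(v):
        c = next(t for t in v if t % q)
        inv = pow(c, -1, q)
        return tuple(inv * t % q for t in v)

    pts = sorted({normalize(v) for v in product(range(q), repeat=3) if any(v)})
    lines = [frozenset(i for i, x in enumerate(pts)
                       if sum(a * b for a, b in zip(n, x)) % q == 0)
             for n in pts]
    return list(range(len(pts))), lines

def is_shattered(subset, sets):
    """True if every subset of `subset` arises as S & subset for some S in `sets`."""
    traces = {S & subset for S in sets}
    return len(traces) == 2 ** len(subset)

def vc_dimension(points, sets):
    """Largest d such that some d-point subset is shattered (brute force)."""
    d = 0
    while d < len(points) and any(
            is_shattered(frozenset(s), sets) for s in combinations(points, d + 1)):
        d += 1
    return d
```

The loop in `vc_dimension` relies on the fact that subsets of a shattered set are shattered, so it can stop at the first size for which no subset is shattered.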
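
5.2. Proof of Proposition 1.7. The following crude lemma yields lower bounds on the bracketing numbers.

Lemma 5.3. Let $\mu$ be a probability measure on $\mathbb{N}$. Then

$$\inf_{\mathrm{card}\,\pi \le 3^N} \sup_{C \in \mathcal{C}} \mu(\partial_\pi C) > \varepsilon \quad \text{implies} \quad N_{[]}(\mathcal{C}, \varepsilon, \mu) > N,$$

where the infimum ranges over all partitions $\pi$ of $\mathbb{N}$ with $\mathrm{card}\,\pi \le 3^N$.

Proof. Suppose $N_{[]}(\mathcal{C}, \varepsilon, \mu) \le N$. Then there are $k \le N$ pairs $\{C_i^+, C_i^-\}_{1 \le i \le k}$ of subsets of $\mathbb{N}$ such that $\mu(C_i^+ \setminus C_i^-) \le \varepsilon$ for all $1 \le i \le k$, and for every $C \in \mathcal{C}$, there exists $1 \le i \le k$ such that $C_i^- \subseteq C \subseteq C_i^+$. Let $\pi$ be the partition generated by $\{C_i^+, C_i^- : 1 \le i \le k\}$. Then $\mathrm{card}\,\pi \le 3^N$, as $\pi$ is the common refinement of at most $N$ partitions $\{C_i^-, C_i^+ \setminus C_i^-, \mathbb{N} \setminus C_i^+\}$ of size three.

Now choose any $C \in \mathcal{C}$, and choose $1 \le i \le k$ such that $C_i^- \subseteq C \subseteq C_i^+$. As $C_i^-$ and $\mathbb{N} \setminus C_i^+$ are unions of atoms of $\pi$ by construction, and as $C_i^- \subseteq C$ and $(\mathbb{N} \setminus C_i^+) \cap C = \varnothing$, we evidently have $\partial_\pi C \subseteq C_i^+ \setminus C_i^-$. Thus $\mu(\partial_\pi C) \le \varepsilon$. As this holds for any $C \in \mathcal{C}$, we complete the proof by contradiction. □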
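
The two combinatorial facts used in this proof (the bound $\mathrm{card}\,\pi \le 3^k$ and the inclusion $\partial_\pi C \subseteq C_i^+ \setminus C_i^-$) can be illustrated on a finite universe. The following sketch is a toy check under stated assumptions: the universe is a finite range of integers rather than $\mathbb{N}$, and the function names and the example brackets are hypothetical.

```python
from itertools import combinations

def generated_partition(universe, brackets):
    """Atoms of the partition generated by the sets lo_i, hi_i (lo_i a subset of hi_i)."""
    cells = {}
    for x in universe:
        # Each bracket puts x into one of three cells: lo, hi minus lo, or outside hi.
        key = tuple(0 if x in lo else 1 if x in hi else 2 for lo, hi in brackets)
        cells.setdefault(key, set()).add(x)
    return [frozenset(c) for c in cells.values()]

def boundary(partition, C):
    """pi-boundary of C: union of atoms that meet C but are not contained in C."""
    out = set()
    for P in partition:
        if P & C and not P <= C:
            out |= P
    return frozenset(out)

def sets_in_bracket(lo, hi):
    """All sets C with lo <= C <= hi (inclusion)."""
    gap = sorted(hi - lo)
    for r in range(len(gap) + 1):
        for extra in combinations(gap, r):
            yield lo | frozenset(extra)

def check_lemma(universe, brackets):
    """Check card(pi) <= 3^k and boundary(pi, C) <= hi - lo for every bracketed C."""
    pi = generated_partition(universe, brackets)
    if len(pi) > 3 ** len(brackets):
        return False
    return all(boundary(pi, C) <= hi - lo
               for lo, hi in brackets for C in sets_in_bracket(lo, hi))

# Hypothetical example with two overlapping brackets on {0, ..., 9}.
EXAMPLE = [(frozenset({0, 1}), frozenset({0, 1, 2, 3})),
           (frozenset({2, 5}), frozenset({2, 3, 5, 6, 7}))]
```

Calling `check_lemma(range(10), EXAMPLE)` runs both checks on the example brackets.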

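Denote by $\mu_j$ the uniform distribution on $N_j$. Let $(p_j)_{j \in \mathbb{N}}$ be a sequence of nonnegative numbers $p_j \ge 0$ so that $\sum_j p_j = 1$, and define the probability measure

$$\mu = \sum_{j=1}^\infty p_j\, \mu_j.$$

We first obtain a lower bound on $N_{[]}(\mathcal{C}, \varepsilon, \mu)$. Subsequently, we will be able to choose the sequence $(p_j)_{j \in \mathbb{N}}$ such that this bound grows arbitrarily quickly.

To obtain a lower bound, let us suppose that $N_{[]}(\mathcal{C}, \varepsilon, \mu) \le N$. Then applying Lemma 5.3, there exists a partition $\pi$ of $\mathbb{N}$ with $\mathrm{card}\,\pi \le 3^N$ such that

$$\sup_j\, p_j \min_{\mathrm{card}\,\pi' \le 3^N} \max_{C \in E_{q_j}} \frac{\mathrm{card}\, \partial_{\pi'} C}{m_j} \le \sup_j\, p_j \max_{C \in \mathcal{C}_j} \mu_j(\partial_\pi C) \le \sup_{C \in \mathcal{C}} \mu(\partial_\pi C) \le \varepsilon.$$

By Proposition 5.1,

$$\min_{\mathrm{card}\,\pi' \le 3^N} \max_{C \in E_{q_j}} \frac{\mathrm{card}\, \partial_{\pi'} C}{m_j} \le \frac{\varepsilon}{p_j} \quad \text{implies} \quad m_j^{1/4} \sqrt{1 - \frac{\varepsilon}{p_j} \wedge 1} \le 3^N.$$

Therefore, $N_{[]}(\mathcal{C}, \varepsilon, \mu) \le N$ implies that

$$N \ge \frac{1}{4} \log_3 m_j + \frac{1}{2} \log_3 \Big(1 - \frac{\varepsilon}{p_j} \wedge 1\Big)$$

for every $j \in \mathbb{N}$. It follows that

$$N_{[]}(\mathcal{C}, \varepsilon, \mu) \ge \sup_j \bigg[ \frac{1}{4} \log_3 m_j + \frac{1}{2} \log_3 \Big(1 - \frac{\varepsilon}{p_j} \wedge 1\Big) \bigg].$$

This bound holds for any choice of $(p_j)_{j \in \mathbb{N}}$.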

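Fix $n(\varepsilon) \uparrow \infty$ as $\varepsilon \downarrow 0$. We now choose $(p_j)_{j \in \mathbb{N}}$ such that $N_{[]}(\mathcal{C}, \varepsilon, \mu) \ge n(\varepsilon)$. First, as $m_j \uparrow \infty$, we can choose a subsequence $j(k) \uparrow \infty$ such that

$$m_{j(\lfloor \log_2(2/(3\varepsilon)) \rfloor)} \ge 3^{4 n(\varepsilon) + 6} \quad \text{for all } 0 < \varepsilon < 1/3.$$

Now define $(p_j)_{j \in \mathbb{N}}$ as follows:

$$p_{j(k)} = 2^{-k} \text{ for } k \in \mathbb{N}, \qquad p_j = 0 \text{ for } j \notin \{j(k) : k \in \mathbb{N}\}.$$

Then we clearly have, setting $J(\varepsilon) = j(\lfloor \log_2(2/(3\varepsilon)) \rfloor)$,

$$N_{[]}(\mathcal{C}, \varepsilon, \mu) \ge \frac{1}{4} \log_3 m_{J(\varepsilon)} + \frac{1}{2} \log_3 \Big(1 - \frac{\varepsilon}{p_{J(\varepsilon)}} \wedge 1\Big) \ge \lfloor n(\varepsilon) + 1 \rfloor > n(\varepsilon)$$

for all $0 < \varepsilon < 1/3$. This completes the proof.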
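
The last chain of inequalities can be verified term by term. The following worked computation is a sketch that uses only the choices of $j(k)$ and $p_j$ made above; the shorthand $k = \lfloor \log_2(2/(3\varepsilon)) \rfloor$ and $J = j(k)$ is introduced here for illustration, with $0 < \varepsilon < 1/3$ so that $k \ge 1$.

$$p_J = 2^{-k} \ge 2^{-\log_2(2/(3\varepsilon))} = \frac{3\varepsilon}{2}, \qquad \text{so} \qquad \frac{\varepsilon}{p_J} \le \frac{2}{3} \quad \text{and} \quad \frac{1}{2} \log_3\Big(1 - \frac{\varepsilon}{p_J} \wedge 1\Big) \ge \frac{1}{2} \log_3 \frac{1}{3} = -\frac{1}{2}.$$

$$\frac{1}{4} \log_3 m_J \ge \frac{1}{4}\big(4 n(\varepsilon) + 6\big) = n(\varepsilon) + \frac{3}{2}.$$

Adding the two estimates gives

$$\frac{1}{4} \log_3 m_J + \frac{1}{2} \log_3\Big(1 - \frac{\varepsilon}{p_J} \wedge 1\Big) \ge n(\varepsilon) + 1 \ge \lfloor n(\varepsilon) + 1 \rfloor > n(\varepsilon).$$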



Appendix A Boolean and stochastic independence. An essential property of a Boolean σ-independent sequence of sets is that there must exist a probability measure under which these sets are i.i.d. This idea dates back to Marczewski [23], who showed that such a probability measure exists on the σ-field generated by these sets. For our purposes, we will need the resulting probability measure to be defined on the larger σ-field $\mathcal{X}$ of the underlying standard measurable space $(X, \mathcal{X})$. One could apply an extension theorem for measures on standard measurable spaces (for example, [34, p. 194]) to deduce the existence of such a measure from Marczewski's result. However, a direct proof is easily given.

Theorem A.1. Let $(X, \mathcal{X})$ be a standard measurable space. Let $(A_i, B_i)_{i \in \mathbb{N}}$ be a sequence of pairs of sets $A_i, B_i \in \mathcal{X}$ such that $A_i \cap B_i = \varnothing$ for every $i \in \mathbb{N}$ and

$$\bigcap_{j \in F} A_j \cap \bigcap_{j \notin F} B_j \ne \varnothing \quad \text{for every } F \subseteq \mathbb{N}.$$

Let $p \in [0, 1]$. Then there exists a probability measure $\mu$ on $(X, \mathcal{X})$ such that $\mu(A_i) = \mu(X \setminus B_i) = p$ for every $i \in \mathbb{N}$, and such that $(A_i)_{i \in \mathbb{N}}$ are independent under $\mu$.
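
PROOF. Let $\mathcal{B}^*$ be the universal completion of the Borel σ-field of $\{0,1\}^{\mathbb{N}}$, and let $C_j = \{\omega \in \{0,1\}^{\mathbb{N}} : \omega_j = 1\}$ for $j \in \mathbb{N}$. Moreover, let $\nu$ be the probability measure on $\mathcal{B}^*$ under which $(C_j)_{j \in \mathbb{N}}$ are independent and $\nu(C_j) = p$ for every $j \in \mathbb{N}$.

Define for every $\omega \in \{0,1\}^{\mathbb{N}}$ the set

$$H(\omega) = \bigcap_{j : \omega_j = 1} A_j \cap \bigcap_{j : \omega_j = 0} B_j.$$

It suffices to show that there is a measurable map $\iota : (\{0,1\}^{\mathbb{N}}, \mathcal{B}^*) \to (X, \mathcal{X})$ such that $\iota(\omega) \in H(\omega)$ for every $\omega \in \{0,1\}^{\mathbb{N}}$. Indeed, as $\iota^{-1}(A_j) = C_j$ and $\iota^{-1}(B_j) = \{0,1\}^{\mathbb{N}} \setminus C_j$ for every $j \in \mathbb{N}$, the measure $\mu(\cdot) = \nu(\iota^{-1}(\cdot))$ has the desired properties.

It remains to prove the existence of $\iota$. To this end, note that the set

$$\Gamma = \{(\omega, x) : x \in H(\omega)\} = \bigcap_{j \in \mathbb{N}} \big\{ C_j \times A_j \cup (\{0,1\}^{\mathbb{N}} \setminus C_j) \times B_j \big\}$$

is measurable, $\Gamma \in \mathcal{B}(\{0,1\}^{\mathbb{N}}) \otimes \mathcal{X}$, where $\mathcal{B}(\{0,1\}^{\mathbb{N}})$ denotes the Borel σ-field of $\{0,1\}^{\mathbb{N}}$. As $H(\omega)$ is nonempty for every $\omega \in \{0,1\}^{\mathbb{N}}$ by assumption, the existence of $\iota$ now follows by the measurable section theorem [11, Theorem 8.5.3]. □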

Remark A.2. In the above proof, the assumption that $(X, \mathcal{X})$ is standard is required to apply the measurable section theorem. When $(X, \mathcal{X})$ is an arbitrary measurable space, we could of course invoke the axiom of choice to obtain a map $\iota : \{0,1\}^{\mathbb{N}} \to X$ such that $\iota(\omega) \in H(\omega)$ for every $\omega \in \{0,1\}^{\mathbb{N}}$, but such a map need not be measurable in general. On the other hand, as $\iota^{-1}(A_j) = C_j$ and $\iota^{-1}(B_j) = \{0,1\}^{\mathbb{N}} \setminus C_j$, it follows that $\iota$ is necessarily Borel-measurable if we choose $\mathcal{X} = \sigma\{A_j, B_j : j \in \mathbb{N}\}$. Thus we recover a result along the lines of Marczewski by using the same proof.

The proof of Theorem 1.6 uses the following connection between Boolean independence and $\gamma$-shattering, which is a trivial modification of a result of Assouad [7] (cf. [13, Theorem 4.6.2]). We give the proof for completeness.

LEMMA A.3. Let $\{f_1, \ldots, f_{2^n}\}$ be a finite family of functions on a set $X$ that is Boolean independent at levels $(\alpha, \beta)$ with $\beta - \alpha \ge \gamma$. Then the family $\{f_1, \ldots, f_{2^n}\}$ $\gamma$-shatters some finite subset $\{x_1, \ldots, x_n\} \subseteq X$.

Proof. Define $\ell(F) = 1 + \sum_{j \in F} 2^{j-1}$ for $F \subseteq \{1, \ldots, n\}$, so that $\ell(F)$ assigns to every $F \subseteq \{1, \ldots, n\}$ a unique integer between $1$ and $2^n$. Choose some point

$$x_j \in \bigcap_{F \ni j} \{f_{\ell(F)} < \alpha\} \cap \bigcap_{F \not\ni j} \{f_{\ell(F)} > \beta\}$$

for every $j = 1, \ldots, n$. Then for any $F \subseteq \{1, \ldots, n\}$, we have $f_{\ell(F)}(x_j) < \alpha$ if $j \in F$ and $f_{\ell(F)}(x_j) > \beta$ if $j \notin F$. Therefore $\{x_1, \ldots, x_n\}$ is $\gamma$-shattered. □
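
The indexing argument in this proof can be traced on a toy family. The sketch below is an illustration under explicit assumptions that are not in the original: the ground set is taken to be the collection of all subsets of $\{1, \ldots, 2^n\}$, the family is $f_i(x) = 0$ if $i \in x$ and $1$ otherwise (which is Boolean independent at the hypothetical levels $\alpha = 0.25$, $\beta = 0.75$), and all function names are hypothetical.

```python
from itertools import combinations

ALPHA, BETA = 0.25, 0.75  # hypothetical levels, BETA - ALPHA = 0.5

def ell(F):
    """Index l(F) = 1 + sum over j in F of 2^(j-1), for F a subset of {1,...,n}."""
    return 1 + sum(2 ** (j - 1) for j in F)

def subsets(n):
    """All subsets of {1,...,n} as frozensets."""
    return [frozenset(c) for r in range(n + 1)
            for c in combinations(range(1, n + 1), r)]

def f(i, x):
    """Toy family: a point x is a set of indices; f_i(x) = 0 if i in x, else 1."""
    return 0.0 if i in x else 1.0

def shattering_points(n):
    """x_j is the point {l(F) : j in F}, which lies below ALPHA exactly for those f_l(F)."""
    return {j: frozenset(ell(F) for F in subsets(n) if j in F)
            for j in range(1, n + 1)}

def is_gamma_shattered(n):
    """Check f_l(F)(x_j) < ALPHA if j in F and f_l(F)(x_j) > BETA otherwise, for all F, j."""
    xs = shattering_points(n)
    return all((f(ell(F), xs[j]) < ALPHA) if j in F else (f(ell(F), xs[j]) > BETA)
               for F in subsets(n) for j in range(1, n + 1))
```

The check succeeds because $\ell$ is injective: $\ell(F)$ belongs to the point $x_j$ exactly when $j \in F$.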

Appendix B Decomposition theorems. Part of the proof of Corollary 1.4 relies on the decomposition of stochastic processes with respect to the invariant and tail σ-fields. These theorems will be given presently.

The first theorem is the well-known ergodic decomposition. As this result is classical, we state it here without proof (see [35, Theorem 6.6] or [19, Theorem 10.26], for example, for elementary proofs). In the following, for any standard space $(Y, \mathcal{Y})$, we denote by $\mathcal{P}(Y, \mathcal{Y})$ the space of probability measures on $(Y, \mathcal{Y})$. The space $\mathcal{P}(Y, \mathcal{Y})$ is endowed with the σ-field generated by the evaluation mappings $\pi_B : \mu \mapsto \mu(B)$, $B \in \mathcal{Y}$. Recall that if $(X, \mathcal{X})$ is standard, then so is $(X^{\mathbb{N}}, \mathcal{X}^{\otimes \mathbb{N}})$.

THEOREM B.1. Let $(X, \mathcal{X})$ be a standard space, and denote by $(Z_n)_{n \in \mathbb{N}}$ the canonical process on the space $(X^{\mathbb{N}}, \mathcal{X}^{\otimes \mathbb{N}})$. Let $\mu \in \mathcal{P}(X^{\mathbb{N}}, \mathcal{X}^{\otimes \mathbb{N}})$ be a stationary probability measure. Then there exists a probability measure $\rho$ on $\mathcal{P}(X^{\mathbb{N}}, \mathcal{X}^{\otimes \mathbb{N}})$ such that

$$\mu(A) = \int \nu(A)\, \rho(d\nu) \quad \text{for every } A \in \mathcal{X}^{\otimes \mathbb{N}},$$

and such that there exists a measurable subset $B$ of $\mathcal{P}(X^{\mathbb{N}}, \mathcal{X}^{\otimes \mathbb{N}})$ with $\rho(B) = 1$ and with the property that every $\nu \in B$ is stationary and ergodic.

The second theorem is similar in spirit to Theorem B.1, where we now decompose with respect to the tail σ-field rather than with respect to the invariant σ-field. This result is closely related to the decomposition theorem for Gibbs measures (see, for example, [16]). For completeness, we provide a self-contained proof.

THEOREM B.2. Let $(\Omega, \mathcal{G}, \mu)$ be a standard probability space. Let $(\mathcal{G}_{-n})_{n \in \mathbb{N}}$ be a reverse filtration with each $\mathcal{G}_{-n} \subseteq \mathcal{G}$ countably generated. Fix for every $n \in \mathbb{N}$ a version $\mu_{-n}$ of the regular conditional probability $\mu(\,\cdot \mid \mathcal{G}_{-n})$. Then there exists a probability measure $\rho$ on $\mathcal{P}(\Omega, \mathcal{G})$ such that

$$\mu(A) = \int \nu(A)\, \rho(d\nu) \quad \text{for every } A \in \mathcal{G},$$

and such that there is a measurable subset $B$ of $\mathcal{P}(\Omega, \mathcal{G})$ with $\rho(B) = 1$ and

1. The tail σ-field $\mathcal{G}_{-\infty} = \bigcap_n \mathcal{G}_{-n}$ is $\nu$-trivial for every $\nu \in B$.

2. $\nu(A \mid \mathcal{G}_{-n}) = \mu_{-n}(A)$ $\nu$-a.s. for every $\nu \in B$, $A \in \mathcal{G}$, and $n \in \mathbb{N}$.

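PROOF. Let $\mu_{-\infty}$ be a version of the regular conditional probability $\mu(\,\cdot \mid \mathcal{G}_{-\infty})$, whose existence is guaranteed as $(\Omega, \mathcal{G})$ is standard. We consider $\mu_{-\infty} : \Omega \to \mathcal{P}(\Omega, \mathcal{G})$ as a $\mathcal{G}_{-\infty}$-measurable random probability measure $\omega \mapsto \mu_{-\infty}^\omega$ in the usual manner (e.g., [19, Lemma 1.40]). Let $\rho \in \mathcal{P}(\mathcal{P}(\Omega, \mathcal{G}))$ be the law under $\mu$ of the random measure $\mu_{-\infty}$. It follows directly from the definition of regular conditional probability that

$$\mu(A) = \int \mu_{-\infty}^\omega(A)\, \mu(d\omega) = \int \nu(A)\, \rho(d\nu) \quad \text{for every } A \in \mathcal{G}.$$

It remains to obtain a set $B$ with the two properties in the statement of the theorem. We begin with the second property. Note that

$$\int \big| \nu(1_C\, \mu_{-n}(A)) - \nu(A \cap C) \big|\, \rho(d\nu) = \mathbf{E}_\mu\Big( \big| \mathbf{E}_\mu\big( 1_C\, \mu_{-n}(A) - 1_{A \cap C} \,\big|\, \mathcal{G}_{-\infty} \big) \big| \Big) = 0$$

for every $n \in \mathbb{N}$, $A \in \mathcal{G}$, and $C \in \mathcal{G}_{-n}$; the last equality holds by the tower property, as $1_C\, \mu_{-n}(A)$ is a version of $\mathbf{E}_\mu(1_{A \cap C} \mid \mathcal{G}_{-n})$ and $\mathcal{G}_{-\infty} \subseteq \mathcal{G}_{-n}$. Let $\mathcal{G}_{-n}^0$ be a countable generating algebra for $\mathcal{G}_{-n}$ and let $\mathcal{G}^0$ be a countable generating algebra for $\mathcal{G}$. Evidently

$$\int 1_C(\omega)\, \mu_{-n}^\omega(A)\, \nu(d\omega) = \nu(A \cap C) \quad \text{for every } n \in \mathbb{N},\ A \in \mathcal{G}^0,\ C \in \mathcal{G}_{-n}^0$$

for all $\nu$ in a measurable subset $B_0$ of $\mathcal{P}(\Omega, \mathcal{G})$ with $\rho(B_0) = 1$. But the monotone class theorem allows us to extend this identity to all $A \in \mathcal{G}$ and $C \in \mathcal{G}_{-n}$. Thus we have $\nu(A \mid \mathcal{G}_{-n}) = \mu_{-n}(A)$ $\nu$-a.s. for every $\nu \in B_0$, $A \in \mathcal{G}$, and $n \in \mathbb{N}$.

We now proceed to the first property. For any $A \in \mathcal{G}$, we have

$$\int \nu\big( \nu(A \mid \mathcal{G}_{-\infty}) = \nu(A) \big)\, \rho(d\nu) = \int \nu\Big( \limsup_{n \to \infty} \mu_{-n}(A) = \nu(A) \Big)\, \rho(d\nu) = \mu\Big( \limsup_{n \to \infty} \mu_{-n}(A) = \mu(A \mid \mathcal{G}_{-\infty}) \Big) = 1,$$

where we have used the martingale convergence theorem and the previously established fact that $\nu(\mu_{-n}(A) = \nu(A \mid \mathcal{G}_{-n}) \text{ for all } n \in \mathbb{N}) = 1$ for $\rho$-a.e. $\nu$. Therefore, it follows that $\nu(A \mid \mathcal{G}_{-\infty}) = \nu(A)$ $\nu$-a.s. for all $A \in \mathcal{G}^0$ for every $\nu$ in a measurable subset $B_1$ of $\mathcal{P}(\Omega, \mathcal{G})$ with $\rho(B_1) = 1$. By the monotone class theorem, $\nu(A \mid \mathcal{G}_{-\infty}) = \nu(A)$ $\nu$-a.s. for every $\nu \in B_1$ and $A \in \mathcal{G}$. But then evidently $\mathcal{G}_{-\infty}$ is $\nu$-trivial for every $\nu \in B_1$. Choosing $B = B_0 \cap B_1$ completes the proof. □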

Appendix C Counterexamples in nonstandard spaces. The assumption that $(X, \mathcal{X})$ is standard is used in the proof of Theorem 1.3 to establish the implications 1, 3 ⇒ 4 and 4 ⇒ 2. The goal of this appendix is to show that these implications may indeed fail when $(X, \mathcal{X})$ is not standard. To this end we provide two counterexamples, based on the following simple observation.

Lemma C.1. There exists a Boolean σ-independent sequence of functions on a set $X$ if and only if $\mathrm{card}\, X \ge 2^{\aleph_0}$.

PROOF. Suppose there exists a Boolean σ-independent sequence $(f_j)_{j \in \mathbb{N}}$ of functions $f_j : X \to \mathbb{R}$. Then there exist $\alpha < \beta$ such that for every $F \subseteq \mathbb{N}$, the set

$$\bigcap_{j \in F} \{f_j < \alpha\} \cap \bigcap_{j \notin F} \{f_j > \beta\}$$

contains at least one point. As these sets are disjoint for distinct $F \subseteq \mathbb{N}$, and there are $2^{\aleph_0}$ subsets of $\mathbb{N}$, it follows that $\mathrm{card}\, X \ge 2^{\aleph_0}$. Conversely, if $\mathrm{card}\, X \ge 2^{\aleph_0}$, there exists an injective map $\iota : \{0,1\}^{\mathbb{N}} \to X$. Define the sets $C_j = \{\iota(\omega) : \omega \in \{0,1\}^{\mathbb{N}},\ \omega_j = 1\} \subseteq X$. Then the sequence $(1_{C_j})_{j \in \mathbb{N}}$ is Boolean σ-independent. □

Both examples below are consistent with the usual axioms of set theory (that is, the set theory ZFC) but depend on additional set-theoretic axioms. I do not know whether it is possible to obtain counterexamples in the absence of additional axioms.

C.1 An example where 1, 3 ⇏ 4. Let $X$ be an uncountable Polish space, and let $\mathcal{X}$ be the universal completion of its Borel σ-field. Then $(X, \mathcal{X})$ is certainly not a standard measurable space. It is known, see Sierpiński and Szpilrajn [28], that there exists a set $A \in \mathcal{X}$ with $\mathrm{card}\, A = \aleph_1$ that is universally null, that is, $\mu(A) = 0$ for every nonatomic probability measure $\mu$ on $\mathcal{X}$. As every subset $C \subseteq A$ is in the $\mu$-completion of the Borel σ-field of $X$ for every probability measure $\mu$, it follows that $C \in \mathcal{X}$ for every $C \subseteq A$.

As is noted by Dudley, Giné and Zinn [15, p. 494], the family of indicators $\mathcal{F}_A = \{1_C : C \subseteq A\}$ is a universal Glivenko-Cantelli class. Moreover, as $A$ is a $\mu$-null set for every nonatomic probability measure, it is evident that $N(\mathcal{F}_A, \varepsilon, \mu) = N(\mathcal{F}_A, \varepsilon, \mu_{\mathrm{at}}) < \infty$ for every $\varepsilon > 0$ and probability measure $\mu$, where $\mu_{\mathrm{at}}$ denotes the atomic part of $\mu$. But assuming the continuum hypothesis, we have $\mathrm{card}\, A = 2^{\aleph_0}$ and therefore $\mathcal{F}_A$ contains a Boolean σ-independent sequence $\mathcal{F}$ by Lemma C.1. Clearly $\mathcal{F}$ is a separable uniformly bounded family of measurable functions on $(X, \mathcal{X})$ for which the implications 1, 3 ⇒ 4 of Theorem 1.3 fail.

Remark C.2. The existence of a universally null set does not require the continuum hypothesis: Sierpiński and Szpilrajn [28] construct such a set in ZFC (the construction follows directly from Hausdorff [17], see also [22, Theorem 1.2]). Nonetheless, the present counterexample does depend on the continuum hypothesis and may fail in its absence. Indeed, there exist models of the set theory ZFC in which every universally null set has cardinality strictly less than $2^{\aleph_0}$, see Laver [22, p. 152], Miller [26, pp. 577-578], or Ciesielski and Pawlikowski [10, p. xii and Theorem 1.1.4]. In such a model, $\mathcal{F}_A$ cannot contain a Boolean σ-independent sequence by Lemma C.1.

C.2 An example where 4 ⇏ 2. The present counterexample follows from the following result that is proved below.

PROPOSITION C.3. It is consistent with the set theory ZFC that there exists a probability space $(X, \mathcal{X}, \mu)$ with $\mathrm{card}\, X < 2^{\aleph_0}$ such that there is a sequence of sets $(C_j)_{j \in \mathbb{N}} \subseteq \mathcal{X}$ that are independent under $\mu$ with $\mu(C_j) = 1/2$ for every $j \in \mathbb{N}$.

This result easily yields the desired example. Let $(X, \mathcal{X}, \mu)$ and $(C_j)_{j \in \mathbb{N}}$ be as in Proposition C.3, and define the class $\mathcal{F} = \{1_{C_j} : j \in \mathbb{N}\}$. The proof of the implication 3 ⇒ 4 of Theorem 1.3 shows that $N_{[]}(\mathcal{F}, \varepsilon, \mu) \ge N(\mathcal{F}, \varepsilon, \mu) = \infty$ for $\varepsilon > 0$ sufficiently small. On the other hand, $\mathcal{F}$ cannot contain a Boolean σ-independent sequence by Lemma C.1. Thus $\mathcal{F}$ is a separable uniformly bounded family of measurable functions on $(X, \mathcal{X})$ for which the implication 4 ⇒ 2 of Theorem 1.3 fails.

Remark C.4. It is clear that the present counterexample must depend on a model of set theory in which the continuum hypothesis fails. Indeed, the set $X$ in Proposition C.3 must be uncountable as it supports a (stochastically) independent sequence. Therefore, if we assume the continuum hypothesis, then necessarily $\mathrm{card}\, X \ge 2^{\aleph_0}$ and we cannot guarantee the nonexistence of a Boolean σ-independent sequence.

Denote by $\lambda$ the Lebesgue measure on $[0, 1]$, and denote by $\lambda^*$ the Lebesgue outer measure. The proof of Proposition C.3 is based on the following remarkable fact: there exist models of the set theory ZFC in which there is a subset $X \subseteq [0, 1]$ with $\mathrm{card}\, X < 2^{\aleph_0}$ such that $\lambda^*(X) > 0$; see Martin and Solovay [24, section 4.1], Kunen [21, Theorem 3.19], or Judah and Shelah [18]. The existence of such a set $X$ will be assumed in the proof of Proposition C.3. Note that the set $X$ cannot be Lebesgue measurable (if $X$ were measurable, it would contain a Borel set of positive measure, which has cardinality $2^{\aleph_0}$ by the Borel isomorphism theorem).

Proof of Proposition C.3. Assume a model of the set theory ZFC in which there exists a set $X \subseteq [0, 1]$ with $\mathrm{card}\, X < 2^{\aleph_0}$ such that $\lambda^*(X) > 0$. Let $\mathcal{X}$ be the trace of the Borel σ-field of $[0, 1]$ on $X$, that is, $\mathcal{X} = \{A \cap X : A \in \mathcal{B}([0, 1])\}$. Choose a measurable cover $\tilde{X}$ of $X$, and note that $A \cap \tilde{X}$ is a measurable cover of $A \cap X$ whenever $A \in \mathcal{B}([0, 1])$. We may therefore unambiguously define $\mu(A \cap X) = \lambda(A \cap \tilde{X}) / \lambda(\tilde{X})$ for $A \in \mathcal{B}([0, 1])$, and it is easily verified that $\mu$ is a probability measure on $(X, \mathcal{X})$ whose definition does not depend on the choice of $\tilde{X}$.

We now claim the following: for every set $C \in \mathcal{X}$ with $\mu(C) > 0$, there exists a set $C' \in \mathcal{X}$, $C' \subseteq C$, with $\mu(C') = \mu(C)/2$. Indeed, let $C = A \cap X$ for some $A \in \mathcal{B}([0, 1])$. As the function $\phi : t \mapsto \lambda(A \cap \tilde{X} \cap [0, t])$ is continuous and $\phi(0) = 0$, $\phi(1) = \lambda(A \cap \tilde{X})$, there exists by the intermediate value theorem $0 < s < 1$ such that $\phi(s) = \lambda(A \cap \tilde{X})/2$. Therefore $C' = C \cap [0, s]$ yields the desired set.

Now inductively define for every $n \ge 1$ and $\omega \in \{0,1\}^n$ a set $A_\omega \in \mathcal{X}$ as follows. For $n = 1$, choose a set $A_0 \in \mathcal{X}$ such that $\mu(A_0) = 1/2$, and define $A_1 = X \setminus A_0$. For $n > 1$, choose for every $\omega \in \{0,1\}^{n-1}$ a set $A_{\omega 0} \in \mathcal{X}$ such that $A_{\omega 0} \subseteq A_\omega$ with $\mu(A_{\omega 0}) = \mu(A_\omega)/2$, and define $A_{\omega 1} = A_\omega \setminus A_{\omega 0}$. Finally, define for every $n \ge 1$

$$C_n = \bigcup_{\omega \in \{0,1\}^n : \omega_n = 0} A_\omega.$$

Then $\mu(C_n) = 1/2$ for every $n \ge 1$, and $\mu(C_{i_1} \cap \cdots \cap C_{i_k}) = 2^{-k}$ for every $k \ge 1$ and $1 \le i_1 < i_2 < \cdots < i_k$. This evidently completes the proof. □
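
The halving construction at the end of this proof can be illustrated in the simplest possible setting. The sketch below is only an analogy under a loudly stated assumption: it replaces the nonmeasurable set $X$ and the measure $\mu$ by the interval $[0, 1)$ with Lebesgue measure, so that each halving step is just a dyadic bisection. The names `A`, `C`, and `measure_of_intersection` are hypothetical, and exact arithmetic is done with `fractions.Fraction`.

```python
from fractions import Fraction
from itertools import product

def A(omega):
    """Dyadic interval [a, b) for a binary word omega: bit 0 keeps the left half."""
    a, width = Fraction(0), Fraction(1)
    for bit in omega:
        width /= 2
        if bit == 1:
            a += width
    return (a, a + width)

def C(n):
    """C_n as a list of intervals: union of A_omega over words of length n ending in 0."""
    return [A(w) for w in product((0, 1), repeat=n) if w[-1] == 0]

def measure_of_intersection(indices):
    """Lebesgue measure of the intersection of C_i for i in indices.

    Since the sets A_omega are nested, an atom A_w of depth max(indices)
    lies in C_i exactly when the i-th bit of w is 0.
    """
    depth = max(indices)
    total = Fraction(0)
    for w in product((0, 1), repeat=depth):
        if all(w[i - 1] == 0 for i in indices):
            a, b = A(w)
            total += b - a
    return total
```

In this toy model, `measure_of_intersection` returns $2^{-k}$ for any $k$ distinct indices, matching the independence claim for the sets $C_n$.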